[Computer-go] CNN with 54% prediction on KGS 6d+ data
Michael Markefka
michael.markefka at gmail.com
Wed Dec 9 05:14:32 PST 2015
I think ko moves are taken into account in one of the input planes
for most configurations. At least I hope I remember that correctly.
Would it be possible to create such a plane by taking the difference
between the prior input matrix and the following output matrix?
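
A minimal C++ sketch of the difference idea asked about above, under
stated assumptions: the board is a flat row-major array (0 = empty,
1 = black, 2 = white), and `Board` and `koPointByDifference` are
hypothetical names, not part of oakfoam or caffe. It detects only a
simple ko: exactly one stone vanished between the two positions, and the
capturing stone stands alone with the vacated point as its only liberty.
Superko is not handled.

```cpp
#include <cassert>
#include <vector>

// Hypothetical encoding: 0 = empty, 1 = black, 2 = white, row-major size*size.
using Board = std::vector<int>;

// Returns the index of the simple-ko point after the move at `movePos`,
// or -1 if the move did not create a ko. `before` is the position prior to
// the move, `after` the position once the move and captures are applied.
int koPointByDifference(const Board& before, const Board& after,
                        int size, int movePos) {
    assert(static_cast<int>(before.size()) == size * size);
    assert(before.size() == after.size());
    assert(movePos >= 0 && movePos < size * size);

    // Stones present before but gone afterwards were captured by the move.
    int captured = -1, capturedCount = 0;
    for (int p = 0; p < size * size; ++p) {
        if (before[p] != 0 && after[p] == 0) { captured = p; ++capturedCount; }
    }
    if (capturedCount != 1) return -1;  // a ko needs exactly one captured stone

    // The capturing stone must be alone, with exactly one liberty
    // (which is then necessarily the point that was just vacated).
    const int mover = after[movePos];
    const int x = movePos % size, y = movePos / size;
    const int dx[4] = {1, -1, 0, 0}, dy[4] = {0, 0, 1, -1};
    int liberties = 0;
    for (int d = 0; d < 4; ++d) {
        const int nx = x + dx[d], ny = y + dy[d];
        if (nx < 0 || ny < 0 || nx >= size || ny >= size) continue;
        const int n = after[ny * size + nx];
        if (n == mover) return -1;  // connected to a friendly stone: no ko
        if (n == 0) ++liberties;
    }
    return (liberties == 1) ? captured : -1;
}
```

The returned index could then be written into an extra input plane (or
used to mask the network output) before the next forward pass.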
On Wed, Dec 9, 2015 at 2:08 PM, Igor Polyakov
<weiqiprogramming at gmail.com> wrote:
> I doubt that the illegal moves would fall away since every professional
> would retake the ko... if it was legal
>
>
> On 2015-12-09 4:59, Michael Markefka wrote:
>>
>> Thank you for the feedback, everyone.
>>
>>
>> Regarding the CPU-GPU roundtrips, I'm wondering whether it'd be
>> possible to recursively apply the output matrix to the prior input
>> matrix to update board positions within the GPU and without any
>> actual (possibly CPU-based) evaluation until all branches come up with
>> game ending states. I assume illegal moves would mostly fall away when
>> sticking to the top ten or top five move considerations provided by
>> the CNN.
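
A small C++ sketch of the top-k restriction described above, with an
explicit legality mask rather than relying on illegal moves dropping out
by themselves (a ko retake may well be the network's first choice). The
function name `topKLegalMoves` and the flat probability/mask vectors are
assumptions for illustration; they are not taken from oakfoam or caffe.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns up to k board indices with the highest policy probability,
// skipping points flagged illegal (e.g. a ko retake or an occupied point).
std::vector<int> topKLegalMoves(const std::vector<float>& probs,
                                const std::vector<bool>& legal, int k) {
    assert(probs.size() == legal.size());
    assert(k >= 0);

    // Collect the indices of all legal points.
    std::vector<int> idx;
    for (int i = 0; i < static_cast<int>(probs.size()); ++i)
        if (legal[i]) idx.push_back(i);

    // Order only the best n entries, highest probability first.
    const int n = std::min<int>(k, static_cast<int>(idx.size()));
    std::partial_sort(idx.begin(), idx.begin() + n, idx.end(),
                      [&](int a, int b) { return probs[a] > probs[b]; });
    idx.resize(n);
    return idx;
}
```

On a GPU the same masking would be applied to the output matrix before
selecting moves; the CPU version here only shows the selection logic.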
>>
>> As for performance, I could imagine initialization being relatively
>> slow, but I wouldn't be surprised if GPU-based CNN performance allowed
>> a branch size, running many boards in parallel with comparatively
>> minor performance impact, at which this outweighs the initial
>> overhead again.
>>
>> Whether this would provide a better evaluation function than MCTS I
>> don't know, but just like Alvaro I would love to see this tried, even
>> if just to rule it out for the moment.
>>
>>
>> I've got a GTX 980 Ti on a 4790k with 16 GB at home. For a low key
>> test I could run Windows (CUDA installed and running, tested with
>> pylearn2) or Ubuntu from a live setup on USB and would be willing to
>> run test code, if somebody provided a package I could simply download
>> and execute.
>>
>>
>> All the best
>>
>> Michael
>>
>>
>> On Tue, Dec 8, 2015 at 7:52 PM, Álvaro Begué <alvaro.begue at gmail.com>
>> wrote:
>>>
>>> Of course whether these "neuro-playouts" are any better than the heavy
>>> playouts currently being used by strong programs is an empirical
>>> question.
>>> But I would love to see it answered...
>>>
>>>
>>>
>>> On Tue, Dec 8, 2015 at 1:31 PM, David Ongaro <david.ongaro at hamburg.de>
>>> wrote:
>>>>
>>>> Did everyone forget the fact that stronger playouts don't necessarily
>>>> lead to a better evaluation function? (Yes, that's what playouts
>>>> essentially are: a dynamic evaluation function.) This holds even under
>>>> the assumption that we can reach the same number of playouts per move.
>>>>
>>>>
>>>> On 08 Dec 2015, at 10:21, Álvaro Begué <alvaro.begue at gmail.com> wrote:
>>>>
>>>> I don't think the CPU-GPU communication is what's going to kill this
>>>> idea. The latency in actually computing the feed-forward pass of the
>>>> CNN is going to be in the order of 0.1 seconds (I am guessing here),
>>>> which means finishing the first playout will take many seconds.
>>>>
>>>> So perhaps it would be interesting to do something like this for
>>>> correspondence games, but not for regular games.
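
The estimate above can be written out as a back-of-the-envelope
calculation. This is only a sketch: the 0.1 s latency is the guess quoted
above, the 200-300 moves per game figure comes from the earlier message in
this thread, and the assumption that latency does not grow with batch size
is an idealization. The function names are made up for illustration.

```cpp
#include <cassert>

// Rough wall-clock estimate for one serial CNN playout: every move needs
// one forward pass, so the cost is moves * latency.
double serialPlayoutSeconds(int movesPerPlayout, double forwardPassSeconds) {
    assert(movesPerPlayout >= 0 && forwardPassSeconds >= 0.0);
    return movesPerPlayout * forwardPassSeconds;
}

// With batching, all boards in the batch advance one move per forward pass,
// so a whole batch of playouts finishes in about the time of a single one
// (idealized: assumes latency is independent of the batch size).
double playoutsPerSecond(int batchSize, int movesPerPlayout,
                         double forwardPassSeconds) {
    const double t = serialPlayoutSeconds(movesPerPlayout, forwardPassSeconds);
    assert(t > 0.0);
    return batchSize / t;
}
```

With 250 moves and 0.1 s per pass, a single playout takes about 25
seconds; a batch of 100 boards would then yield roughly 4 playouts per
second under the idealized assumption.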
>>>>
>>>>
>>>> Álvaro.
>>>>
>>>>
>>>>
>>>> On Tue, Dec 8, 2015 at 12:03 PM, Petr Baudis <pasky at ucw.cz> wrote:
>>>>>
>>>>> Hi!
>>>>>
>>>>> Well, for this to be practical the entire playout would have to be
>>>>> executed on the GPU, with no round-trips to the CPU. That's what my
>>>>> email was aimed at.
>>>>>
>>>>> On Tue, Dec 08, 2015 at 04:37:05PM +0000, Josef Moudrik wrote:
>>>>>>
>>>>>> Regarding full CNN playouts, I think the problem is that a playout is
>>>>>> a long serial process, given 200-300 moves a game. You need to
>>>>>> construct planes and transfer them to the GPU for each move and read
>>>>>> the result back (at least with current CNN implementations afaik), so
>>>>>> my guess would be that such a playout would take time on the order of
>>>>>> seconds. So there seems to be a tradeoff: CNN playouts are (probably
>>>>>> much) better (at "playing better games") than e.g. distribution
>>>>>> playouts, but whether this is worth the implied (probably much) lower
>>>>>> height of the MC tree is a question.
>>>>>>
>>>>>> Maybe if you had really a lot of GPUs and very high thinking time,
>>>>>> this could be the way.
>>>>>>
>>>>>> Josef
>>>>>>
>>>>>> On Tue, Dec 8, 2015 at 5:17 PM Petr Baudis <pasky at ucw.cz> wrote:
>>>>>>
>>>>>>> Hi!
>>>>>>>
>>>>>>> In case someone is looking for a starting point to actually
>>>>>>> implement Go rules etc. on GPU, you may find this useful:
>>>>>>>
>>>>>>> https://www.mail-archive.com/computer-go@computer-go.org/msg12485.html
>>>>>>>
>>>>>>> I wonder if you can easily integrate caffe GPU kernels in another
>>>>>>> GPU kernel like this? But without training, reimplementing the NN
>>>>>>> could be pretty straightforward.
>>>>>>>
>>>>>>> On Tue, Dec 08, 2015 at 04:53:14PM +0100, Michael Markefka wrote:
>>>>>>>>
>>>>>>>> Hello Detlef,
>>>>>>>>
>>>>>>>> I've got a question regarding CNN-based Go engines I couldn't find
>>>>>>>> anything about on this list. As I've been following your posts
>>>>>>>> here, I thought you might be the right person to ask.
>>>>>>>>
>>>>>>>> Have you ever tried using the CNN for complete playouts? I know
>>>>>>>> that CNNs have been tried for move prediction, immediate scoring
>>>>>>>> and move generation to be used in an MC evaluator, but couldn't
>>>>>>>> find anything about CNN-based playouts.
>>>>>>>>
>>>>>>>> It might only be feasible to play out the CNN's first choice move
>>>>>>>> for evaluation purposes, but considering how well the performance
>>>>>>>> of batch sizes scales, especially on GPU-based CNN applications, it
>>>>>>>> might be possible to set up something like 10 candidate moves, 10
>>>>>>>> reply candidate moves and then have the CNN play out the first
>>>>>>>> choice move for those 100 board positions until the end and then
>>>>>>>> sum up scores again for move evaluation (and/or possibly apply some
>>>>>>>> other tried and tested methods like minimax). Given that the number
>>>>>>>> of 10 moves is supposed to be illustrative rather than
>>>>>>>> representative, other configurations of depth and width in position
>>>>>>>> generation and evaluation would be possible.
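
The two aggregation options mentioned above (summing scores versus
minimax) can be sketched in C++ as follows. This assumes the playout
results are already available as a grid indexed by candidate move and
reply; `ScoreGrid`, `bestByMean` and `bestByMinimax` are hypothetical
names, and how the scores are produced (CNN playouts on the GPU) is left
out entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// scores[i][j]: playout result, from the mover's point of view, after
// candidate move i and opponent reply j (hypothetical layout).
using ScoreGrid = std::vector<std::vector<double>>;

// Averaging: rank each candidate by its mean score over all replies.
int bestByMean(const ScoreGrid& scores) {
    assert(!scores.empty());
    int best = 0;
    double bestVal = 0.0;
    for (int i = 0; i < static_cast<int>(scores.size()); ++i) {
        assert(!scores[i].empty());
        const double mean =
            std::accumulate(scores[i].begin(), scores[i].end(), 0.0) /
            static_cast<double>(scores[i].size());
        if (i == 0 || mean > bestVal) { best = i; bestVal = mean; }
    }
    return best;
}

// Minimax: assume the opponent picks the reply that is worst for us,
// then choose the candidate whose worst case is best.
int bestByMinimax(const ScoreGrid& scores) {
    assert(!scores.empty());
    int best = 0;
    double bestVal = 0.0;
    for (int i = 0; i < static_cast<int>(scores.size()); ++i) {
        assert(!scores[i].empty());
        const double worst =
            *std::min_element(scores[i].begin(), scores[i].end());
        if (i == 0 || worst > bestVal) { best = i; bestVal = worst; }
    }
    return best;
}
```

The two rules can disagree: a candidate with a high average but one
refuting reply wins under averaging and loses under minimax.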
>>>>>>>>
>>>>>>>> It feels like CNN can provide a very focused, high-quality width in
>>>>>>>> move generation, but it might also be possible to apply that
>>>>>>>> quality to depth of evaluation.
>>>>>>>>
>>>>>>>> Any thoughts to share?
>>>>>>>>
>>>>>>>>
>>>>>>>> All the best
>>>>>>>>
>>>>>>>> Michael
>>>>>>>>
>>>>>>>> On Tue, Dec 8, 2015 at 4:13 PM, Detlef Schmicker <ds2 at physik.de>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>>>>>> Hash: SHA1
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> as somebody asked, I will offer my current CNN for testing.
>>>>>>>>>
>>>>>>>>> It has 54% prediction on KGS 6d+ data (which I thought would be
>>>>>>>>> state of the art when I started training, but it is not anymore :).
>>>>>>>>>
>>>>>>>>> it has:
>>>>>>>>>   1, 2, 3, >=4 libs playing color
>>>>>>>>>   1, 2, 3, >=4 libs opponent color
>>>>>>>>>   Empty points
>>>>>>>>>   last move
>>>>>>>>>   second last move
>>>>>>>>>   third last move
>>>>>>>>>   fourth last move
>>>>>>>>>
>>>>>>>>> input layers, and it is fully convolutional, so by just editing
>>>>>>>>> the golast19.prototxt file you can use it for 13x13 as well, as I
>>>>>>>>> did last Sunday. It was used in the November tournament as well.
>>>>>>>>>
>>>>>>>>> You can find it at
>>>>>>>>> http://physik.de/CNNlast.tar.gz
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> If you try it, here are some points I would like to discuss:
>>>>>>>>>
>>>>>>>>> - It seems to me that the playouts get much more important with
>>>>>>>>> such a strong move prediction. Often the move prediction seems
>>>>>>>>> better than the playouts (I use 8000 at the moment against pachi
>>>>>>>>> 32000 with about 70% winrate on 19x19, but with an extremely
>>>>>>>>> focused progressive widening (a=400, a=20 was usual)).
>>>>>>>>>
>>>>>>>>> - Life and death becomes worse. My interpretation is that the
>>>>>>>>> strong CNN does not play moves which obviously do not help a group
>>>>>>>>> to live, but which would help the playouts to recognize that the
>>>>>>>>> group is dead. (http://physik.de/example.sgf: with weaker move
>>>>>>>>> prediction the top black group was read as very dead; with the
>>>>>>>>> good CNN it was 30% alive or so :( )
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> OK, hope you try it. As you know, our engine oakfoam is open source :)
>>>>>>>>> We just merged all the CNN stuff into the main branch!
>>>>>>>>> https://bitbucket.org/francoisvn/oakfoam/wiki/Home
>>>>>>>>> http://oakfoam.com
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Do the very best with the CNN
>>>>>>>>>
>>>>>>>>> Detlef
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> code:
>>>>>>>>> // Planes 0-3: stones of the side to move, by liberties (1, 2, 3, >=4).
>>>>>>>>> // Planes 4-7: opponent stones, same bucketing. Plane 8: empty points.
>>>>>>>>> if (col==Go::BLACK) {
>>>>>>>>>   for (int j=0;j<size;j++)
>>>>>>>>>     for (int k=0;k<size;k++)
>>>>>>>>>     {
>>>>>>>>>       // clear all input planes at this point
>>>>>>>>>       for (int l=0;l<caffe_test_net_input_dim;l++)
>>>>>>>>>         data[l*size*size+size*j+k]=0;
>>>>>>>>>       int pos=Go::Position::xy2pos(j,k,size);
>>>>>>>>>       // liberties of the group at pos, 0-based and capped at 3
>>>>>>>>>       int libs=0;
>>>>>>>>>       if (board->inGroup(pos))
>>>>>>>>>         libs=board->getGroup(pos)->numRealLibs()-1;
>>>>>>>>>       if (libs>3) libs=3;
>>>>>>>>>       if (board->getColor(pos)==Go::BLACK)
>>>>>>>>>       {
>>>>>>>>>         data[(0+libs)*size*size + size*j + k]=1.0;
>>>>>>>>>       }
>>>>>>>>>       else if (board->getColor(pos)==Go::WHITE)
>>>>>>>>>       {
>>>>>>>>>         data[(4+libs)*size*size + size*j + k]=1.0;
>>>>>>>>>       }
>>>>>>>>>       else if (board->getColor(pos)==Go::EMPTY)
>>>>>>>>>       {
>>>>>>>>>         data[8*size*size + size*j + k]=1.0;
>>>>>>>>>       }
>>>>>>>>>     }
>>>>>>>>> }
>>>>>>>>> // Same encoding with the colors swapped when white is to move.
>>>>>>>>> if (col==Go::WHITE) {
>>>>>>>>>   for (int j=0;j<size;j++)
>>>>>>>>>     for (int k=0;k<size;k++)
>>>>>>>>>     {
>>>>>>>>>       for (int l=0;l<caffe_test_net_input_dim;l++)
>>>>>>>>>         data[l*size*size+size*j+k]=0;
>>>>>>>>>       int pos=Go::Position::xy2pos(j,k,size);
>>>>>>>>>       int libs=0;
>>>>>>>>>       if (board->inGroup(pos))
>>>>>>>>>         libs=board->getGroup(pos)->numRealLibs()-1;
>>>>>>>>>       if (libs>3) libs=3;
>>>>>>>>>       if (board->getColor(pos)==Go::BLACK)
>>>>>>>>>       {
>>>>>>>>>         data[(4+libs)*size*size + size*j + k]=1.0;
>>>>>>>>>       }
>>>>>>>>>       else if (board->getColor(pos)==Go::WHITE)
>>>>>>>>>       {
>>>>>>>>>         data[(0+libs)*size*size + size*j + k]=1.0;
>>>>>>>>>       }
>>>>>>>>>       else if (board->getColor(pos)==Go::EMPTY)
>>>>>>>>>       {
>>>>>>>>>         data[8*size*size + size*j + k]=1.0;
>>>>>>>>>       }
>>>>>>>>>     }
>>>>>>>>> }
>>>>>>>>> if (caffe_test_net_input_dim > 9) {
>>>>>>>>>   if (board->getLastMove().isNormal()) {
>>>>>>>>>     int j=Go::Position::pos2x(board->getLastMove().getPosition(),size);
>>>>>>>>>     int k=Go::Position::pos2y(board->getLastMove().getPosition(),size);
>>>>>>>>>     data[9*size*size+size*j+k]=1.0;
>>>>>>>>>   }
>>>>>>>>>   if (board->getSecondLastMove().isNormal()) {
>>>>>>>>>     int j=Go::Position::pos2x(board->getSecondLastMove().getPosition(),size);
>>>>>>>>>     int k=Go::Position::pos2y(board->getSecondLastMove().getPosition(),size);
>>>>>>>>>     data[10*size*size+size*j+k]=1.0;
>>>>>>>>>   }
>>>>>>>>>   if (board->getThirdLastMove().isNormal()) {
>>>>>>>>>     int j=Go::Position::pos2x(board->getThirdLastMove().getPosition(),size);
>>>>>>>>>     int k=Go::Position::pos2y(board->getThirdLastMove().getPosition(),size);
>>>>>>>>>     data[11*size*size+size*j+k]=1.0;
>>>>>>>>>   }
>>>>>>>>>   if (board->getForthLastMove().isNormal()) {
>>>>>>>>>     int j=Go::Position::pos2x(board->getForthLastMove().getPosition(),size);
>>>>>>>>>     int k=Go::Position::pos2y(board->getForthLastMove().getPosition(),size);
>>>>>>>>>     data[12*size*size+size*j+k]=1.0;
>>>>>>>>>   }
>>>>>>>>> }
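
For readers without the oakfoam sources at hand, here is a self-contained
C++ sketch of the stone-plane part of the encoding quoted above (planes
0-8 only, without the last-move planes). It is not the author's code:
`Color`, `Point` and `encodeStonePlanes` are hypothetical stand-ins for
the oakfoam board API, and the board is assumed to be stored row-major
with a precomputed liberty count per point. The two color branches of the
original are folded into one loop by choosing the plane offset from the
side to move.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

enum Color { EMPTY = 0, BLACK = 1, WHITE = 2 };

// Hypothetical stand-in for the board: per-point color and the liberty
// count of the group occupying that point (ignored for empty points).
struct Point { Color color; int groupLibs; };

// Fills planes 0-3 (stones of the side to move, by liberties 1, 2, 3, >=4),
// planes 4-7 (opponent stones, same bucketing) and plane 8 (empty points).
std::vector<float> encodeStonePlanes(const std::vector<Point>& board,
                                     int size, Color toPlay) {
    assert(static_cast<int>(board.size()) == size * size);
    assert(toPlay == BLACK || toPlay == WHITE);
    const int area = size * size;
    std::vector<float> data(9 * area, 0.0f);  // all planes start cleared
    for (int j = 0; j < size; ++j) {
        for (int k = 0; k < size; ++k) {
            const int cell = size * j + k;
            const Point& p = board[cell];
            if (p.color == EMPTY) {
                data[8 * area + cell] = 1.0f;
                continue;
            }
            // Clamp liberties to 1..4, then shift to a 0-based plane index.
            const int libs = std::min(std::max(p.groupLibs, 1), 4) - 1;
            const int base = (p.color == toPlay) ? 0 : 4;
            data[(base + libs) * area + cell] = 1.0f;
        }
    }
    return data;
}
```

Each point sets exactly one of the nine planes, so the planes form a
one-hot encoding per board point.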
>>>>>>>>>
>>>>>>>>> -----BEGIN PGP SIGNATURE-----
>>>>>>>>> Version: GnuPG v2.0.22 (GNU/Linux)
>>>>>>>>>
>>>>>>>>> iQIcBAEBAgAGBQJWZvOlAAoJEInWdHg+Znf4t8cP/2a9fE7rVb3Hz9wvdMkvVkFS
>>>>>>>>> 4Y3AomVx8i56jexVyXuzKihfizVRM7x6lBiwjYBhj4Rm9UFWjj2ZvDzBGCm3Sy4I
>>>>>>>>> SpG8D01VnzVR6iC1YTu3ecv9Wo4pTjc7NL5pAxiZDB0V7OTRklfZAYsX4mWyHygn
>>>>>>>>> cr1pIb79/9QfBf/johmuutXJIwYfVG9ShR1+udbxs3aU3QDAbJJ4eTs8oj+NqFpg
>>>>>>>>> JolEEEg3wY693e77SqbUbjxR3kSsysoz9h1nKnR/ZjHByqlwNvSz9ho9eU0rKhaK
>>>>>>>>> GSQ22/c1VPIZhr24FYBbYNYweOzDtonLpuUFCPSnYVels3h/I/LlqV3MeDo6wuZ2
>>>>>>>>> QCPp5+11o4JzvEt7A4zfJCtEOEH0W2/+IjRcIkAVOo65OV/pPsz2EjHehMU6PC6m
>>>>>>>>> vXA/kPx0jqUm1qSb0qCgMq5ZvSqfpcCY7JOlkEwkDBS1fty9sU0hqst3zXR0KGtn
>>>>>>>>> rFuoREmQYi/mkjZfS2Q4AHiZUDbDZUKzRegUA+gR/eKAmJsmWeTDEI9ZAXgxL0cB
>>>>>>>>> p1HGBNDEUKGk+ruq0gIe5vYygyBcJV0BbbBnweDjeZnlG8vLUAVoMF6V/q3gkZb1
>>>>>>>>> P61rfE4d9dohfGBsZ+UWltRyWMj09ieR2G2zCDpIXyxEuoV6CTAlLzDuhmqFa2ma
>>>>>>>>> Fp3lK/uLhOucXwBtStdx
>>>>>>>>> =E47K
>>>>>>>>> -----END PGP SIGNATURE-----
>>>>>>>>> _______________________________________________
>>>>>>>>> Computer-go mailing list
>>>>>>>>> Computer-go at computer-go.org
>>>>>>>>> http://computer-go.org/mailman/listinfo/computer-go
>>>>>>>
>>>>>>> --
>>>>>>> Petr Baudis
>>>>>>> If you have good ideas, good data and fast computers,
>>>>>>> you can do almost anything. -- Geoffrey Hinton
>>>>>
>>>>>
>>>>