![]() (I have a hunch this may be the main issue) Perhaps the index is so full of junk (endnotes, text fragments, references to book pages) rather than “real content” (sentences about rabbit ears, etc) and the encoder just throws up its hands in despair because it can’t make any sense of most of it.Sure, it’s just a few PDFs but it’s broken down into several thousand chunks, many of which are full sentences or images) Perhaps the dataset itself is just too small (I don’t believe this.Perhaps the encoder or model itself isn’t great (I’m not convinced of this - CLIP isn’t ideal for text, but it’s serviceable at least, as can be seen in our fashion search).On the plus side, it did bring up stuff about rabbits instead of chocolate, but that’s all that can be said for it. ![]()
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |