Or, apples vs. papaya salad.
Short version: I’ve moved Tropology to PostgreSQL for performance reasons, and because after some evaluation, Neo4j wasn’t as good a fit as it seemed at first blush.
You may want to start by catching up with the other Tropology articles.
Episode IV: A new approach
We got better performance with the last changes I made to the import process, at the cost of parallelization and a significantly less clean Clojure codebase.
As I was going about wrapping them up, I got a recommendation from Michael Hunger. Summarized:
- Don’t use the batch REST API
- The Cypher query endpoint is outdated
- Try merges
Well then. So much for that feature branch.
The story so far
Yesterday we were discussing our (admittedly) somewhat ghetto, not-quite-batched Neo4j implementation.
The long and short of it is that I was initially attacking the import process as one would on a JDBC client for a relational database. Query for these values here, create those if some don’t exist, insert relationships here, etc.
That seems to be woefully inefficient in Neo4j.
Welcome. You may want to first catch up on Part 1 of this little experiment.
All done? OK then.
This is the first part of a series of articles on a small experiment I’m building. The intent is to crawl TVTropes, to find possible relationships between tropes and the material they appear in, as well as shared concepts.
I want to be able to not only visualize a concept and the tropes that it links to, but also how they relate to each other. A bonus would be being able to query how far away a concept is from another - for instance, how many steps we need to take before we can go from Cowboy Bebop to Macross Missile Massacre.