Added an index for tweet_id in the _media tables for new bins #303
xmacex wants to merge 5 commits into
…ns. Fixes digitalmethodsinitiative#299, for new bins.
…erous, suggesting modifying existing tables, so someone please watch after me :)
…ns. Honestly speaking, I am cannibalizing the '24/05/2016 Fix index of tweet_id on withheld tables', commit 16110ec, which looks most like the update I suggest here.
…mi-tcat into exports_with_media_slow
…d3bd27, which looks like it might have caused the export with media slowness by removing the index for tweet_id, but instead I am building incrementally on top of all the upgrades, in chronological order of this script
LOL please someone review my code for the upgrade script. There are a couple of layers of conditional logic and I am not entirely confident about the contexts in which the upgrade script runs. I feel that what I propose could be simplified, but I'm sticking close to the code I cannibalized from 16110ec. Of course I tested it on my toy installation, and it seems to work, but the upgrade path is long and I didn't test all execution paths, though I hope I reasoned myself through all of them... a refactoring opportunity maybe? I also made an inappropriate change in f0df436 by poking around an earlier 2015 change, but cancelled the change in 21c6a8b after I figured out the logic of the upgrade script, and wrote what I consider a more appropriate change by imitating what the upgrade script seems to be doing, i.e. layering changes upon one another. Sorry and thank you 😺
Thanks @xmacex and sorry for taking so long to review! The code looks solid! Would it be possible to update upgrade.php, so that your change can be rolled out?
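For anyone reviewing the conditional logic discussed above, here is a minimal sketch of the "only add the index where it is missing" idea, so that running the step twice is harmless. It is not the code from this PR or from upgrade.php: it uses Python's built-in `sqlite3` as a stand-in for TCAT's MySQL database, and the table names (`legacy_media`, `fresh_media`, `legacy_tweets`), the index naming scheme, and the helper function names are all hypothetical.

```python
import sqlite3


def media_tables(conn):
    """Return the names of all tables whose name ends in '_media'."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return sorted(name for (name,) in rows if name.endswith("_media"))


def has_tweet_id_index(conn, table):
    """True if some index on `table` has tweet_id as its first column."""
    indexes = conn.execute(f'PRAGMA index_list("{table}")').fetchall()
    for index in indexes:
        index_name = index[1]  # row layout: seq, name, unique, ...
        columns = conn.execute(f'PRAGMA index_info("{index_name}")').fetchall()
        # row layout: seqno, cid, column name
        if columns and columns[0][2] == "tweet_id":
            return True
    return False


def add_missing_indexes(conn):
    """Create a tweet_id index on every _media table that lacks one."""
    created = []
    for table in media_tables(conn):
        if not has_tweet_id_index(conn, table):
            # Hypothetical index name; the real script may name it differently.
            conn.execute(
                f'CREATE INDEX "{table}_tweet_id" ON "{table}" (tweet_id)'
            )
            created.append(table)
    conn.commit()
    return created


# Toy database: one bin without the index, one that already has it,
# and an unrelated table that must be left alone.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE legacy_media (id INTEGER PRIMARY KEY, tweet_id INTEGER, url TEXT)")
conn.execute("CREATE TABLE fresh_media (id INTEGER PRIMARY KEY, tweet_id INTEGER, url TEXT)")
conn.execute("CREATE INDEX fresh_media_tweet_id ON fresh_media (tweet_id)")
conn.execute("CREATE TABLE legacy_tweets (id INTEGER PRIMARY KEY, text TEXT)")

first_run = add_missing_indexes(conn)   # indexes only the table that needs it
second_run = add_missing_indexes(conn)  # nothing left to do
print(first_run, second_run)
```

The check-then-create shape is what makes the step safe to repeat; in MySQL the existence check would have to be done differently (for example by inspecting the table's existing indexes), which this sketch does not attempt to reproduce.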
Fixes #299 for new bins, by considerably speeding up the queries which are done from analysis/mod.export_tweets.php to look up fields from the media table.

Does not solve the issue with already existing bins, for which the same index must be created. I wrote the following Python program (improvements welcome of course), but am unsure how to best contribute something similar to TCAT: