
Issue 568 removing apks

Old process

Previously, the update process worked as follows:

  • Collect metadata from index.
  • Copy the app/apk/app-category-join tables from the main database on disk to a temporary database (for performance reasons).
  • For each batch of 50 apps in the index, save the batch to the temporary database.
  • When finished, delete all the app/apk/app-category-join data from the main database.
  • Then copy the data from the temp DB to the main database.

New process

To fix #568, the process has been amended as follows:

  • When creating the temp database, only rows which don't belong to the repo being updated are copied over (we are effectively deleting all of that repo's old data, so that it can be re-added anew).
  • Instead of deleting all the app/apk/app-category-join rows from the main database before copying, we only remove those belonging to the repo being updated (to make space for the new data we just collected).
  • When copying the temp data into the main database, only copy the data belonging to the repo being updated.
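
The three steps above can be sketched roughly as follows. This is only an illustration using Python's `sqlite3` module, not the actual F-Droid code: the single `apk` table with `id`, `repo` and `name` columns, the `tmp_apk` temp table (standing in for the separate temporary database) and the `update_repo` function are all hypothetical.

```python
import sqlite3


def update_repo(main, repo_id, new_apk_names):
    """Replace every row of one repo while leaving other repos untouched.

    Hypothetical schema: apk(id INTEGER PRIMARY KEY, repo INTEGER, name TEXT).
    A TEMP table stands in for the temporary database described above.
    """
    main.execute(
        "CREATE TEMP TABLE tmp_apk (id INTEGER PRIMARY KEY, repo INTEGER, name TEXT)"
    )
    # Step 1: copy only the rows that do NOT belong to the repo being updated.
    main.execute(
        "INSERT INTO tmp_apk SELECT id, repo, name FROM apk WHERE repo != ?",
        (repo_id,),
    )
    # Save the freshly collected index data: plain INSERTs, no existence check.
    main.executemany(
        "INSERT INTO tmp_apk (repo, name) VALUES (?, ?)",
        [(repo_id, name) for name in new_apk_names],
    )
    # Step 2: remove only this repo's old rows from the main table.
    main.execute("DELETE FROM apk WHERE repo = ?", (repo_id,))
    # Step 3: copy back only the rows belonging to the repo being updated.
    main.execute(
        "INSERT INTO apk SELECT id, repo, name FROM tmp_apk WHERE repo = ?",
        (repo_id,),
    )
    main.execute("DROP TABLE tmp_apk")
    main.commit()


# Small demonstration with two repos; repo 1 is the one being updated.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE apk (id INTEGER PRIMARY KEY, repo INTEGER, name TEXT)")
demo.executemany(
    "INSERT INTO apk VALUES (?, ?, ?)",
    [(1, 1, "old-a"), (2, 2, "other"), (3, 1, "old-b")],
)
update_repo(demo, 1, ["new-a", "new-b"])
print(demo.execute("SELECT id, repo, name FROM apk ORDER BY id").fetchall())
```

In this sketch the row from repo 2 keeps its ID, and apks that have disappeared from repo 1's index are simply never re-added.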

It seems a bit silly to copy all of the data from other, unrelated repos when we go from main DB -> temp DB, but I think it is required so that we reserve all of the sqlite rowids for those rows. Otherwise, if we just left an empty temp database and started inserting new data, it would start with IDs of 1, which already exist in the main DB.
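
The rowid behaviour described here can be checked with a small experiment. This is a sketch assuming a hypothetical `apk` table whose `id` column is an `INTEGER PRIMARY KEY` (an alias for the rowid, without `AUTOINCREMENT`); the `first_new_rowid` helper is made up for illustration.

```python
import sqlite3


def first_new_rowid(existing_ids):
    """Return the ID SQLite assigns to the first row inserted without an explicit ID."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE apk (id INTEGER PRIMARY KEY, name TEXT)")
    # Pre-populate the table, as the copy from main DB -> temp DB would.
    db.executemany(
        "INSERT INTO apk (id, name) VALUES (?, 'copied')",
        [(i,) for i in existing_ids],
    )
    # Insert a new row and let SQLite pick the rowid.
    cursor = db.execute("INSERT INTO apk (name) VALUES ('new')")
    new_id = cursor.lastrowid
    db.close()
    return new_id


print(first_new_rowid([]))         # empty temp table: starts at 1
print(first_new_rowid([1, 2, 7]))  # rows from other repos copied first: continues at 8
```

With the other repos' rows already present, new rows are numbered after the largest copied ID, so they cannot collide with rows that remain in the main DB.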

Performance

This comes with the added benefit that we no longer have to check, for each app/apk, whether it:

  • Already exists (and hence we need to UPDATE it).
  • Doesn't yet exist (and hence we need to INSERT it).

This halved the update time during my testing (120 seconds -> 60 seconds for the main repo on my MotoX 2nd Gen, and 210 seconds -> 105 seconds for the archive).

Better still, I noticed that there was another query which was executed far more often than necessary. For each app being updated we have a String category name, which needs to be converted to a long (i.e. the ID of the corresponding row in fdroid_category). To do this, the code was querying the fdroid_category table at least once per app. I've made a change to cache the result of this query, because we never delete rows from fdroid_category, so their IDs should never change.
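
The idea behind the cache looks roughly like this. It is a minimal sketch rather than the real implementation: the `CategoryIdCache` class, the `name` column and the `queries` counter are assumptions made for illustration; only the `fdroid_category` table name comes from the description above.

```python
import sqlite3


class CategoryIdCache:
    """Resolve category names to row IDs, querying the database once per name."""

    def __init__(self, db):
        self.db = db
        self.ids = {}      # category name -> row ID
        self.queries = 0   # how many times the table was actually queried

    def get_id(self, name):
        if name in self.ids:
            return self.ids[name]
        self.queries += 1
        row = self.db.execute(
            "SELECT id FROM fdroid_category WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            # Unknown category: create it and remember the new ID.
            cursor = self.db.execute(
                "INSERT INTO fdroid_category (name) VALUES (?)", (name,)
            )
            category_id = cursor.lastrowid
        else:
            category_id = row[0]
        # Safe to cache because rows are never deleted, so IDs never change.
        self.ids[name] = category_id
        return category_id


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE fdroid_category (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
cache = CategoryIdCache(db)
for app_category in ["Games", "System", "Games", "Games", "System"]:
    cache.get_id(app_category)
print(cache.queries)  # 2 lookups for 5 apps
```

The cache is only valid because rows are never removed from the table; if categories could be deleted, cached IDs would need invalidating.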

This drastically increased performance even further, down to 15 seconds for the main repo, and 30 seconds for the archive. In total, each update now takes roughly 7 to 8 times less time than before (120 -> 15 seconds and 210 -> 30 seconds), which is close to an order of magnitude.

Considerations

This is quite a large change. Not only does it remove a lot of now-unneeded code, it also changes the way the update process saves data to the database in a non-trivial way. However, this is probably the most well-tested part of the code, and all tests are passing. The tests were indeed helpful during development, telling me straight away when I had an error in my queries, and so I am quietly confident that this works as expected. My testing with F-Droid/F-Droid Archive/GP seems to indicate that all is well.

Fixes #568.
