Desktop App: SQLite Database Locked by Background Tasks

There's currently these tasks run asynchronously in the background.

updating and publishing local repository
updating remote repositories (e.g. f-droid.org or guardianproject)
when adding an app from a remote repository

downloading the APK for the remote app
downloading remote graphic assets for the remote app
downloading app screenshots for the remote app

While they are run, the SQLite database is typically locked for write operations. When the user tries to do anything that requires writing to the database during this time, they are currently presented with this error page:

Some of the above background tasks can take a very long time. E.g. updating the f-droid.org repo (2) initially can take several hours (partly due to admin#35).

This is a really bad user experience (especially on first app start), so we need to solve this problem. Below possible solutions are sketched out.

Use a database that supports concurrency

Other databases such as MySQL or PostgreSQL are a lot better at dealing with concurrency. The problem practically doesn't exist there. However, requiring such databases makes the deployment procedure much more complex and the memory footprint a lot larger. Therefore, this should be avoided if other solutions are possible.

A way forward with this might be to deploy the app as flatpak or snappy image (#160 (closed)) that comes with a database pre-configured.

Shorten the time the database is locked

At the moment, it looks like the first write operation locks the database and only releases the lock once the background tasks finishes. Maybe, there is a way to release the lock earlier, ideally after every write operation and then let the task acquire it again.

This would still not be an optimal solution, because the database would still get locked a lot in short intervals (2). The user might be lucky and get their operation done by pressing Try Again. However, it might also happen that a write operation initiated by the user prevents a running background task from acquiring a lock and thus stop the processing of background tasks entirely. That is a scenario that should be avoided by preventing the background processor from crashing and by making it try again later.

The lock hold by task (1) could be shortened more easily, because there is only one write operation happening at the beginning and one at the end of the task. Both do not need to be atomic.

Allow the user to control background tasks

Run and stop all tasks on demand

When running repomaker as a GUI application in a WebView, it might be possible to add an extra button and status indicator for background tasks. This would allow the user to stop the background tasks for as long as they need to work with their repository.

The downside is that this is also not a good user experience as it makes using repomaker more difficult. The user gets error messages that require understanding the complexities of database locking and gets told to click buttons in various situations. A local repository for example does not get published (1) when the background tasks are not running.

Run tasks synchronously and let user wait

All current background tasks could also be run in the foreground. A local repository would only get published (1) when the user presses a to be added Publish button. Then the user needs to wait until the publishing completes.

Similarly, there could be a button to update remote repositories (2). The downside of this is that it can take a long time to update those and users would rarely update them. This would lead to outdated repository information and errors when the user tries to add apps from the remote repositories, because the referenced apps, their APKs or graphic assets might not be available anymore.

Adding apps from remote repositories (3) could only complete once all remote data has been downloaded. We might run into issues with HTTP requests timing out here.

Overall, these user-centric solution would only advance the status quo a little bit by making it clearer to the users why they need to wait, but introduce other problems such as outdated information.

Use several databases

With Django it is possible to use more than one database. The most problematic models related to remote repositories (2) could be moved to this database, so that they can be locked while the user can freely interact with their local repositories.

However, it is not possible to have relationships across databases and repomaker's current database design uses these relationships:

RemoteRepository references User
RemoteApp references Category
RemoteApkPointer references Apk

It does not seem to be possible to split the models without breaking Django's ORM and manually gluing things back together by using the object's primary keys and trying to keep things in sync between the two databases.

However, the documentation says that in the case of SQLite

there is no enforced referential integrity; as a result, you may be able to ‘fake’ cross database foreign keys. However, this configuration is not officially supported by Django.

So we might get away with doing a split. The full consequences of this are still unknown. However, in single-user desktop mode and the current feature set, there's no other users being added and removed. The same goes for categories. So the problem space might be limited. The only exception is with Apk that are for example removed when not referenced anymore. Checking the references might not work reliably across databases.

Also, later (e.g. for #15 where App would reference RemoteApp) we might want to introduce new inter-database references. These would then increase the risk of things going wrong. So this route should only be taken with great care and some prior tests of how things behave with a split database.

Even if the database can be split into two, this would to resolve background tasks (1) and (3). However, possible solutions for those have been sketched above.

Admin message