

Big daily database update

Mind_nl
Wed 13 December 2006, 07:56 pm GMT +0100
Now that I know how to download a file with curl, I have to process the file and update my database with the values from the new download. The process will update about 5 to 20 thousand records in one table and add 20 thousand new records to another.

Just processing the 7.5 MB file and displaying the records on screen takes about 1.5 minutes. Does anybody have an idea of how long updating the database could take?

A couple of minutes is not too bad, but visitors could see some weird things if they visit during the update, so my idea is this: if I keep 2 versions of the database, the visitors will see the first while the second is being updated. After the update has finished I can have the site switch over to the updated database, and the next update can run on the first database. This way, no matter how long the update takes, the visitors will never encounter a database that is in the process of being updated.

Am I taking this too far? Is there a better way to handle this? Should I just do the update and forget about the potential visitor during the update?

Nikolas
Wed 13 December 2006, 08:23 pm GMT +0100
You can keep a variable somewhere, or a record in the database, that says the update is in progress.

That way you can easily alert the user that the update is running.
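
A minimal sketch of that flag idea, in Python with SQLite purely for illustration (the thread does not say which database or language is in use). The table name `status`, the key `update_in_progress` and the helper functions are all hypothetical.

```python
import sqlite3

# Hypothetical one-row status table holding the "update in progress" flag.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE status (name TEXT PRIMARY KEY, value INTEGER)")
conn.execute("INSERT INTO status VALUES ('update_in_progress', 0)")
conn.commit()


def set_updating(flag):
    """Store the flag so that every page request can see it."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE status SET value = ? WHERE name = 'update_in_progress'",
            (1 if flag else 0,),
        )


def is_updating():
    row = conn.execute(
        "SELECT value FROM status WHERE name = 'update_in_progress'"
    ).fetchone()
    return row[0] == 1


def render_page():
    """Show a notice instead of half-updated data while the flag is set."""
    if is_updating():
        return "The database is being updated, please check back in a few minutes."
    return "normal page"


def run_update(work):
    """Raise the flag, do the update, and always lower the flag afterwards."""
    set_updating(True)
    try:
        work()
    finally:
        set_updating(False)
```

The `try`/`finally` is there so that a failed update does not leave the site stuck on the notice.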

Mind_nl
Wed 13 December 2006, 08:30 pm GMT +0100
That is how I was planning on switching between the two databases, by storing the current live database name in a table. I think I'll just do the two database thing, just for the extra geek-factor ;)

btw: congrats on your 5000th post!

Nikolas
Wed 13 December 2006, 08:36 pm GMT +0100
Yeah, but this is not the correct way to do this.

Renaming the database all the time may cause other problems. For example, what happens if a user comes to the site at the moment you are renaming the database? :) Plus it will create some overhead for the database and the filesystem.

BTW I've just realized that I have made almost 1/4 of the posts in this forum! I think I have to slow down....

Mind_nl
Wed 13 December 2006, 08:54 pm GMT +0100
I'm not renaming the database. The plan is to have two databases, DBa and DBb, with the name of the current live database stored in the config table. Whenever a page is requested, I first read the config table, where I find DBa or DBb, and then read the data from the correct database.
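
Roughly, the plan described here could look like the sketch below. It uses Python with in-memory SQLite databases only so that it runs on its own; the names `DBa` and `DBb` come from the post, while the `config` and `items` tables, the `live_db` key and the function names are assumptions made for illustration.

```python
import sqlite3


def make_db(rows):
    """Create one copy of the data (hypothetical 'items' table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
    conn.commit()
    return conn


# Two full copies of the data, as in the post.
databases = {
    "DBa": make_db([(1, "old")]),
    "DBb": make_db([(1, "old")]),
}

# Separate config store that names the copy visitors should read.
config = sqlite3.connect(":memory:")
config.execute("CREATE TABLE config (name TEXT PRIMARY KEY, value TEXT)")
config.execute("INSERT INTO config VALUES ('live_db', 'DBa')")
config.commit()


def live_name():
    return config.execute(
        "SELECT value FROM config WHERE name = 'live_db'"
    ).fetchone()[0]


def read_value(item_id):
    """What a page request does: look up the live copy, then read from it."""
    conn = databases[live_name()]
    row = conn.execute(
        "SELECT value FROM items WHERE id = ?", (item_id,)
    ).fetchone()
    return row[0] if row else None


def run_update(new_rows):
    """Update the copy that is NOT live, then flip the pointer."""
    standby = "DBb" if live_name() == "DBa" else "DBa"
    with databases[standby] as conn:
        conn.executemany("REPLACE INTO items VALUES (?, ?)", new_rows)
    with config:
        config.execute(
            "UPDATE config SET value = ? WHERE name = 'live_db'", (standby,)
        )
    return standby
```

Visitors keep reading the old copy until the single `UPDATE` on the config table flips the pointer. After the switch, the previously live copy is one update behind, so the next run has to bring it up to date as well.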

Nikolas
Wed 13 December 2006, 08:55 pm GMT +0100
Oh, that's not a bad idea. It will just need some more space, but who cares, right? :)

Mind_nl
Wed 13 December 2006, 08:58 pm GMT +0100
I still have a few GB free on the server so that's not a problem at all.

olaf
Wed 13 December 2006, 09:31 pm GMT +0100
I think that is a good idea; just update a field with the database name once the update has finished.

Nico, updating a lot of files is very time consuming and timeouts could also be a problem. Try handling the data a different way, using replace for example.
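
Assuming "replace" here means the SQL `REPLACE INTO` statement (the post does not spell this out), one way to keep a large update short is to send the rows in batches, one transaction per batch. The sketch below uses Python with SQLite, which also accepts `REPLACE INTO`; the `records` table, its columns and the batch size are hypothetical.

```python
import sqlite3


def apply_rows(conn, rows, batch_size=1000):
    """Insert new rows and overwrite existing ones, one transaction per batch."""
    total = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        with conn:  # one commit per batch instead of one per row
            conn.executemany(
                "REPLACE INTO records (id, value) VALUES (?, ?)", batch
            )
        total += len(batch)
    return total


# Hypothetical table with one existing row that the update will overwrite.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, value REAL)")
conn.execute("INSERT INTO records VALUES (1, 1.0)")
conn.commit()
```

`REPLACE INTO` deletes any row with the same primary key and inserts the new one, so the same statement covers both the updated and the newly added records.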

