Message indexing as separate process

dzekas
Senior user
Posts: 2486
Joined: 2005-10-13 21:28
Location: Lithuania

Message indexing as separate process

Post by dzekas » 2009-11-28 22:59

In 5.3 build 1617, message indexing is part of the main hMailServer process. If a user has a lot of unindexed data and turns on message indexing, indexing overloads the CPU and hMailServer services become unusable. Is it possible to run indexing as a separate process? On multicore/SMP setups, a separate process would use only one core or CPU, and hMailServer might remain usable during indexing.

martin
Developer
Posts: 6834
Joined: 2003-11-21 01:09
Location: Sweden

Re: Message indexing as separate process

Post by martin » 2009-11-29 11:33

At the same time, the current indexing process within hMailServer is done in a single thread, so I think it will only use one core at most.

Maybe giving the indexing thread a lower priority would also solve the problem. As it stands now, the indexing process has priority equal to other parts of hMailServer, which is a bit stupid.
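To make the priority idea concrete, here is a minimal sketch of running background work in a separate, lower-priority process. It is not hMailServer code: hMailServer is a Windows service, where the Win32 `SetThreadPriority` call would be the natural tool for a single thread. The sketch uses Unix niceness as a stand-in, and `CHILD_CODE` and `run_low_priority` are hypothetical names made up for illustration.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for an indexing job: the child just reports its own
# niceness. os.nice(0) changes nothing and returns the current value.
CHILD_CODE = "import os; print(os.nice(0))"


def run_low_priority(code, increment=10):
    """Run Python code in a separate process with lowered priority (Unix only)."""

    def lower_priority():
        # Runs in the child just before exec, so only the child is affected.
        os.nice(increment)

    result = subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=lower_priority,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


parent_nice = os.nice(0)
child_nice = int(run_low_priority(CHILD_CODE))
print(f"parent niceness: {parent_nice}, child niceness: {child_nice}")
```

A higher niceness means the scheduler favours other processes when the CPU is contended, which is the effect wanted here: indexing still runs, but the mail services get the CPU first.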

I'll prepare a mailbox with 500k messages and run some tests on it.

Re: Message indexing as separate process

Post by dzekas » 2009-11-29 12:10

martin wrote:At the same time, the current indexing process within hMailServer is done in a single thread, so I think it will only use one core at most.

Maybe giving the indexing thread a lower priority would also solve the problem. As it stands now, the indexing process has priority equal to other parts of hMailServer, which is a bit stupid.

I'll prepare a mailbox with 500k messages and run some tests on it.
In my case, fewer than 100k messages were enough: one mailbox with 50k, one with 25k, one with 10k, one with 5k, and some smaller boxes. That is over 2 GB of email, 1.5 GB of it in the 50k mailbox. It was a basic hMailServer setup with MS SQL; the database file was about 32 MB. I copied the mails and then turned on indexing. CPU usage was maxed at 50% on a Core2Duo E5200 setup. Other programs were usable, but hMailServer was very slow to respond.

I've reinstalled hMailServer and switched to MySQL in order to separate the DB and hMailServer processes. This time I turned on indexing before copying the test mailboxes. Can you configure hMailServer to use the MySQL DLL from %PATH% instead of the DLL stored in hMailServer's bin directory?

Re: Message indexing as separate process

Post by martin » 2009-11-29 12:32

Can you configure hMailServer to use the MySQL DLL from %PATH% instead of the DLL stored in hMailServer's bin directory?
Hmm, maybe. One potential problem is that it makes the DLL lookup a bit too nondeterministic, in my opinion. If I used %PATH%, hMailServer could stop working if you uninstalled a third-party program that hMailServer happened to depend on via the %PATH% variable.
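The concern can be illustrated with a small sketch. It is a simplified model only: the real Windows DLL search order involves more locations than %PATH%, and the directory names, the `libmysql.dll` file name, and both helper functions are assumptions made for the example.

```python
import os
import tempfile
from pathlib import Path


def find_on_path(filename, path_value):
    """Return the first match for filename in a PATH-style string, or None."""
    for directory in path_value.split(os.pathsep):
        candidate = Path(directory) / filename
        if directory and candidate.is_file():
            return candidate
    return None


def find_in_bin(filename, bin_dir):
    """Return filename from one fixed directory, or None if it is missing."""
    candidate = Path(bin_dir) / filename
    return candidate if candidate.is_file() else None


with tempfile.TemporaryDirectory() as root:
    bin_dir = Path(root) / "hmailserver_bin"    # hypothetical install directory
    third_party = Path(root) / "other_program"  # hypothetical third-party app
    for directory in (bin_dir, third_party):
        directory.mkdir()
        (directory / "libmysql.dll").write_text("stub")

    # In this model, only the third-party directory is on the search path.
    path_value = str(third_party)

    found_via_path_before = find_on_path("libmysql.dll", path_value) is not None
    (third_party / "libmysql.dll").unlink()  # simulate uninstalling that program
    found_via_path_after = find_on_path("libmysql.dll", path_value) is not None
    found_in_bin_after = find_in_bin("libmysql.dll", bin_dir) is not None

print(found_via_path_before, found_via_path_after, found_in_bin_after)
```

In this model, the fixed-directory lookup is unaffected by what other software does, while the %PATH% lookup silently depends on it.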

(I'm having some hardware trouble with my development PC at the moment, so the indexing performance tests will take a while, but I'll get back to you within a few days.)

Re: Message indexing as separate process

Post by martin » 2009-11-29 21:04

I think I've found the cause of the problem: the bottleneck is a database statement where indexes aren't used. I'll put up a new version with an added index at the beginning of the coming week.

Re: Message indexing as separate process

Post by martin » 2009-12-02 19:03

dzekas,

When you used MySQL, did you see the same 50% CPU usage as when you used SQL CE?

By far the largest bottleneck in the indexing process when using SQL CE was a "heavy" SQL statement where no indexes were used. On my PC, it took longer to figure out which messages to index than to actually index them. When using MySQL, better indexes were used and CPU usage was rarely above 5%.
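As a rough illustration of this kind of bottleneck, the sketch below shows how a "which messages still need indexing?" query changes from a full table scan to an index lookup once a suitable index exists. It uses SQLite as a stand-in for SQL CE, and the `messages` table, its columns, and the query are hypothetical, not hMailServer's actual schema or statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per message, with a flag for "already indexed".
conn.execute(
    "CREATE TABLE messages ("
    " id INTEGER PRIMARY KEY,"
    " account_id INTEGER NOT NULL,"
    " indexed INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO messages (account_id, indexed) VALUES (?, ?)",
    [(i % 5, 0 if i % 100 == 0 else 1) for i in range(10_000)],
)

# Hypothetical query for picking the next batch of messages to index.
QUERY = "SELECT id FROM messages WHERE indexed = 0 LIMIT 100"


def query_plan(connection, sql):
    """Return the planner's description of how it would execute sql."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_without_index = query_plan(conn, QUERY)

# Add an index on the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_messages_indexed ON messages (indexed)")
plan_with_index = query_plan(conn, QUERY)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

Without the index, the planner has to scan every row to find the unindexed ones; with it, it can jump straight to the matching rows, which is consistent with the CPU difference described above.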

Re: Message indexing as separate process

Post by dzekas » 2009-12-02 19:38

martin wrote:dzekas,

When you used MySQL, did you see the same 50% CPU usage as when you used SQL CE?

By far the largest bottleneck in the indexing process when using SQL CE was a "heavy" SQL statement where no indexes were used. On my PC, it took longer to figure out which messages to index than to actually index them. When using MySQL, better indexes were used and CPU usage was rarely above 5%.
I turned on indexing before syncing the primary IMAP server with hMailServer. The MySQL daemon process had higher CPU usage than hMailServer, and I think mysqld was maxed at 50%. hMailServer added new mails and indexed existing ones at the same time. I could still use the mail server and the database.

Today I cleared the index cache, turned indexing off, and then turned it on again: 3-6% load. It looks like it was an MS SQL CE issue. A standalone MySQL server is not affected.
