Best (possible) configuration for importing the planet file with osm2pgsql

Hi,

After many hours (actually some days) of work, my import of the planet file into a PostgreSQL database with osm2pgsql failed again. Some things are not clear to me.

This is my configuration:

  • 64-Bit Ubuntu server 10.4,
  • 2 processors,
  • 5 GB RAM,
  • 100 GB hard disk,
  • it’s a virtual machine on VMware Server 2; the host computer is nearly idle and gives all its power to the machine.
  • I followed these instructions and the “tuning” hints described at http://weait.com/content/build-your-own-openstreetmap-server. That worked fine with smaller imports, but I have trouble with the planet file.

1. Disk size?
My import failed on a 100GB disk with an error message that indicated insufficient disk space. So what is the minimum disk size needed? The only thing I know is that 100GB is not enough.

2. Slim mode
I used slim mode because all other attempts failed. This contradicts the article http://wiki.openstreetmap.org/wiki/Osm2pgsql, which says that 64-bit systems are not affected. It seems to me that this statement is not true for all configurations.

3. Cache size and extensive swapping
Following the article above, I set the cache parameter for slim mode to -C 4048 (the computer has 5GB of RAM), hoping this would improve performance a bit. But during the import I saw that the swap partition was being used, not from the beginning but after some hours: 70-85% of the 4GB swap partition was in use all the time. To me that indicates that my cache parameter may be too high, but I’m not sure. So what would you suggest as the best cache parameter for this 5GB system?

4. Is osmosis faster?
I haven’t tried osmosis yet. Maybe it’s faster. What is your opinion/experience?

Sorry… long text, many questions… But help would be very nice.
Bye

  1. The statement is true, because the full planet import needs more memory than can be addressed on a 32-bit system. However, since the full import uses more than 12GB of memory, you’re not going to cut it with your 5GB of RAM. Hence: slim mode required. See this wiki article for a graph of memory usage, taken last year. It’ll only consume more resources today.

  2. You have to realize that the osm2pgsql cache is just that: exclusive to osm2pgsql. PostgreSQL also needs memory (lots of it if you don’t want it to slow to a trickle), as does the rest of your system. I’d go with -C 2048, and see how it runs with that.
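
To make that budget concrete, here is a rough sketch of the arithmetic. Only the 5GB total and the resulting -C 2048 come from this thread; the 2048MB reserved for PostgreSQL and the 1024MB for the rest of the system are illustrative assumptions, not measurements. The script only prints the command instead of running it, and the database name gis and the file name planet.osm.bz2 are placeholders.

```shell
#!/bin/sh
# Rough cache budget for osm2pgsql in slim mode.
# Assumption: the reserves below are illustrative guesses, not measurements.

# cache_mb TOTAL_MB POSTGRES_MB SYSTEM_MB -> prints what is left for -C
cache_mb() {
    left=$(( $1 - $2 - $3 ))
    # Never suggest a zero or negative cache.
    if [ "$left" -lt 1 ]; then
        left=1
    fi
    echo "$left"
}

total_mb=5120      # 5 GB of RAM, as in the question
postgres_mb=2048   # assumed reserve for PostgreSQL
system_mb=1024     # assumed reserve for the OS and everything else

cache=$(cache_mb "$total_mb" "$postgres_mb" "$system_mb")

# Print the command rather than running it; "gis" and the file name are placeholders.
echo "osm2pgsql --slim -C $cache -d gis planet.osm.bz2"
```

If swap still fills up with that value, lowering the cache further is probably a better first step than shrinking the PostgreSQL reserve.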

  3. Simple-schema imports with osmosis might at first appear to be much faster, but that’s only because, by default, osmosis doesn’t build geometries. It all depends on what you want to do with the data.

Europe alone, imported in slim mode, currently takes up 120GB of disk on my server.
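
A quick check before starting a multi-day import can save a failed run. The sketch below compares free space against the Europe-only figure above; the data directory is an assumption (a common Debian/Ubuntu location, overridable through a hypothetical DATA_DIR variable), and the planet will need more than this threshold.

```shell
#!/bin/sh
# has_room AVAIL_KB NEED_GB -> succeeds if AVAIL_KB covers NEED_GB gigabytes
has_room() {
    [ "$1" -ge $(( $2 * 1024 * 1024 )) ]
}

# Assumption: PostgreSQL keeps its data under this directory on your system.
data_dir=${DATA_DIR:-/var/lib/postgresql}
need_gb=120   # the Europe-only figure quoted above; the planet needs more

if [ -d "$data_dir" ]; then
    # df -P prints POSIX-format output; column 4 of line 2 is the free space in kB.
    avail_kb=$(df -Pk "$data_dir" | awk 'NR==2 { print $4 }')
    if has_room "$avail_kb" "$need_gb"; then
        echo "ok: at least ${need_gb} GB free in $data_dir"
    else
        echo "too small: less than ${need_gb} GB free in $data_dir"
    fi
fi
```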

I don’t know anyone not using slim mode for the whole planet. I don’t have any figures, but I suspect you would need something like 256GB of main memory, and I can’t get access to such a costly hardware configuration.

Unfortunately, you’ll have to play with it to find out. As a hint, a 400MB cache was too much on a 2GB system to import europe.osm, which is a way of saying that 2GB is no longer enough to import Europe in a reasonable time (it took me 8 days).
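
While experimenting with the cache value, one sign that it is too high is the steady swap usage described in the question. A minimal sketch for watching it on Linux follows; it assumes the usual SwapTotal and SwapFree lines of /proc/meminfo, and takes the file as an argument so it can be tried on a saved copy.

```shell
#!/bin/sh
# swap_used_pct FILE -> prints the percentage of swap in use,
# reading SwapTotal/SwapFree (in kB) from a /proc/meminfo-style file.
swap_used_pct() {
    awk '
        /^SwapTotal:/ { total = $2 }
        /^SwapFree:/  { unused = $2 }
        END {
            if (total > 0) printf "%d\n", (total - unused) * 100 / total
            else print 0
        }
    ' "$1"
}

# On Linux, report current swap usage; do nothing elsewhere.
if [ -r /proc/meminfo ]; then
    echo "swap in use: $(swap_used_pct /proc/meminfo)%"
fi
```

Running it every few minutes during an import shows whether swap stays busy for hours, which is the point at which a smaller -C value is worth trying.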

Depends on what you want to do in the end

Have you tried that yourself, or is it a supposition?
That’s what the wiki says, but I’m completely unsure whether this figure is accurate for today’s planet.

By the way, if you are interested in osm2pgsql benchmarks, you can read mine here:
http://lists.openstreetmap.org/pipermail/talk-fr/2010-May/021293.html

(it’s in French, sorry, but Google should be able to translate it for you)

Because the current planet file is larger than it was during the time that graph on the wiki was created?

If so, I don’t see what’s wrong with my statement.

Nothing wrong at all, hey! I’m not suggesting that you are a liar!

I was just kindly (wasn’t I? if not, I’m really sorry) asking whether you had fresher information than what the wiki says, so that I can update it to warn users that they’ll need much more main memory now.

Of course your sentence “more than 12 GB” is perfectly correct, I was just wondering if you had any more precise clues… that’s all.

I wasn’t. :slight_smile:

No, I don’t have more recent information, nor have I the computer to test this on. The 12GB number is, even these days, usually enough to convince people that, yes, they really need to use slim mode.

Hi,

Another benchmark data point.

I imported europe.osm.bz2, downloaded from -eh- geofabrik.de a few days ago. I used the bounding box to get northern Europe, from about the Netherlands to the south of Sweden. The full command was:

time osm2pgsql -s -C 3000 -m -b 3,50,14,58 europe.osm.bz2

real 455m34.214s
user 80m55.420s
sys 15m29.580s

My computer is a Core i7 920 with 6 GB of RAM running Gentoo Linux. The first try was without slim mode; it got killed by the system after a few hours.

-peter