Hi,
Cesar Inacio Martins wrote:
> Hi Richard,
>
> Before, just to explain: these tests are meant to discover a "good" way to work with
> this kind of machine, not to tune the "create index" itself ...
>
> My comments bellow
>
> - I keep this configuration because my objective is to use it in production,
> where there are a lot of other tables with 4k pages; that is why I keep the 8K
> bufferpool at 5 GB.
You really should get hold of a load generator and run those very many
queries you have in PROD which are fast, but frequent.
I always try to test with queries taken from daily PROD and rank them,
building a list of <avg number of query occurrences> multiplied by
<avg duration of query>, sorted top down (desc).
Because: if some queries which are short runners (0.05 sec or faster) BUT
show up in the thousands per minute get slower only a tiny bit, you
have a really huge problem to solve.
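If you do not have a load generator handy, even a crude ranking helps. A
minimal sketch, assuming a hypothetical flat log with one line per executed
statement in the form "<query_id> <duration_seconds>" (the log format and the
file name query_times.log are made up for illustration):

  # rank queries by (number of executions) x (avg duration), worst offenders first
  awk '{ n[$1]++; t[$1] += $2 }
       END { for (q in n)
               printf "%-20s %8d %10.4f %12.2f\n", q, n[q], t[q]/n[q], n[q] * (t[q]/n[q]) }' \
      query_times.log | sort -k4,4nr | head -20

The fourth column (count times avg duration) is what I sort on; anything near
the top that is also a sub-second query is where a tiny regression hurts most.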
>
> - There is no I/O wait at any moment...
One possible reason is that 16 LRUs are not enough.
And I know that >>500 LRUs is too much, as was already
pointed out. If you have 2 shots, then give 24 LRUs and 40 LRUs a try.
A remote diagnosis is very hard to do in such a case.
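Trying that only means changing the lrus= value in the BUFFERPOOL lines you
already have; just as an illustration of the first shot (all other values
unchanged from your ONCONFIG):

  BUFFERPOOL
  size=8K,buffers=625000,lrus=24,lru_min_dirty=50.000000,lru_max_dirty=60.500000

and then watch onstat -R to see how evenly the LRU queues fill up.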
> - Solaris, doing cache for KAIO access????? you're sure??
No, no caching for KAIO. But there is for AIO, and it *might* be that
AIO is therefore faster than KAIO, if it reads out of the main memory that
SOL10 uses as a cache. By upping your bufferpool according to your needs
this effect goes away.
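One simple way to watch whether the bigger bufferpool is taking over the work
the OS cache did before (plain onstat, nothing exotic):

  onstat -p    # watch the read %cached column; if it climbs well above 90
               # while AIO loses its edge over KAIO, the buffers now do the caching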
>
> - I will try changing the priority to real time and see what happens
>
> - I keep the AUTO_* just to see if IDS will change something. The most it
> did was add 1 CLEANER (32 cleaners)
>
> - The SHMADD is 100MB and at any moment they are added.
>
> - The temp dbspaces are used...
this indicates that you do disk sorts, which you'd better
avoid. Again an indication that you do not have enough
memory. If running with PDQPRIORITY > 1 you should be able to do
all sorts inside the memory controlled by the memory grant manager.
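The usual quick checks for that (stock onstat commands, plus the session
setting you already use for the tests):

  onstat -D         # page writes against the temp dbspaces => sorts spilling to disk
  onstat -g mgm     # memory grant manager: how much of DS_TOTAL_MEMORY each
                    # parallel operation is actually granted

  export PDQPRIORITY=100    # or SET PDQPRIORITY in the dbaccess session,
                            # as you already do in your create index runs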
>
> - The max size of memory used (onstat -) : 24036352 Kbytes
The T5120 has 32GB. I would use as much memory as possible for IDS,
which is apparently not the case here. It should be possible to give
IDS another 4 GB.
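Rough arithmetic, assuming onstat - reports KB: 24036352 KB is about 23 GB
actually touched by IDS on a 32 GB box, so roughly 9 GB sit outside IDS; even
leaving a generous slice for Solaris and everything else, another 4 GB or so
could go straight into the bufferpools.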
>
> - I keep DS_MAX_SCANS = 4 because, with my limit of 4 CPU cores, if I set
> this value too high, working with PDQ + table partitions will create too many
> threads and too much overhead. But I will test raising this value.
>
> - I will study with the storage admin the possibility of splitting these LUNs
> into more and smaller pieces.
>
> - Unfortunately, not working with RAID 5 is not an option today... maybe in the
> future, but due to hardware limitations we must keep this....
>
> - I will talk to my boss about buying a laptop like yours to use as our server,
> and they can use this server as a desktop... :)
well, possibly 3500 USD is too expensive, if a second DS8300 is too expensive
also .... ;)
As a matter of fact, by buying from sources which integrate at the
heartbeat of technology and not 2-3 yrs behind the market, you can get 4-10 times
the performance for the same amount of money and the same TCO over 36 months,
compared to buying tech which - according to SUN's marketing in 2006 - should have
been replaced by Rock technology in Q1/2007. So actually one cannot
expect to get much more now than was normal from SUN's competition
in Q1/2007, almost 2 years ago.
Just compare SUN Txxxx to supermicro and their 3-channel DDR 1333 memory based
AMD systems. Or even look at Tyan. Or look at X58 chipset based systems having
4x i7-965 CPUs. There you can see memory throughput well over 180 GB / second.
And of course, if you take notice and look for it, you are going to have
4-6 PCIe x16 connectors for your Fusion-IO SSDs. Then you might see
I/O performance in the 80GB / second range, sustained random reads of
2KB blocks, or sustained 90GB / second of random 4KB reads. To get the
sustained write rate, multiply by 0.75!
And I mean it: sustained, not peak. And I can demo this anytime.
Prices? The same as your company spent now, or maybe 5% less.
And yes, no single PoF, everything hot swappable, and TDP values that make
you believe you could run these systems in open air in the summer in
Death Valley (THIS was clearly an exaggeration)
>
> - I did not monitor in detail how much time each create index step takes. But I
> agree this is very important to know if I want to tune the I/O .... I will
> spend more time checking this.
>
> So, since I don't see I/O wait, which is the default behaviour on this kind of
> processor (Niagara), can I have some issue on the CPU or I/O that is hidden
> behind this zero I/O wait?
well, yes. But the potential source of the problem is a wide range of
misfits and not easy to lay out in detail.
One main problem is hyperthreading. This is why ORACLE suggests turning
it off on all Txxx and Mxxx machines when using ORACLE DBs.
Just search for ORA's white papers dealing with this.
But this also means that you effectively only have a 16 CPU thread
machine. Then there is the CPU layout (only 1 FPU per socket) and the
latency when HW threads have to wait for the common L2 cache or
for the memory controller. There is no parallelism in the HW: even
when you run 4 CPUvps going for the buffer pool at the same time, if
those 4 CPUvps happen to be active on the same socket, their memory access
has to be serialized, which is a real PITA.
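If you want to try what ORACLE suggests without touching the ILOM, a sketch
using the same psradm Cesar already used for the no-intr tests (run as root;
CPU IDs as in his TEST 4, i.e. keep only 0, 8, 16, 24 - one HW thread per
core - online):

  psradm -f 1 2 3 4 5 6 7 9 10 11 12 13 14 15 \
            17 18 19 20 21 22 23 25 26 27 28 29 30 31
  psrinfo      # verify: those threads now show up as off-line
               # psradm -n with the same ID list brings them back online when done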
As an extra: read your memory specs and start to cry.
In environments where you have very many lightweight forks, all this
might still be a good fit. Not so if you have processes trying to avoid
friendly yielding or premature time-slice ending, even running
short spin loops to achieve exactly this: stay on the CPU as long as a
software thread in a CPUvp has something to do other than I/O.....
What IDS needs is exactly the contrary of the tuning goal of a T5120.
And the contrary of the tuning goal of the memory model of SOL10.
Sorry that I do not have any good news for you.....
Dic_k
P.S.: I am off for the weekend now, but will check back on this
posting thread, probably early Monday (TZ is UTC+1)
>
> --- On Fri, 6/11/09, Richard Kofler <richard.kofler@chello.at> wrote:
>
> From: Richard Kofler <richard.kofler@chello.at>
> Subject: Re: Performance - Sun SPARC T2 [17982]
> To: ids@iiug.org
> Date: Friday, November 6, 2009, 16:50
>
> Hi,
>
> maybe I can contribute a bit.
>
> A nasty decision this is, when you must go on with this.
> Good DBAs can do some miracles, but we cannot heal every
> wrong decision.
>
> This type of HW combination is nice for a lab dealing with 500
> records, or if it runs some tomcat containers (not too many!).
>
> It is a perfect misfit for IDS (the same goes for Oracle - search for
> ORA whitepapers on how to cope with a challenge like this: they have
> a few ideas on how to survive - somewhat, at least)
>
> More text below
>
> Dic_k
>
> Cesar Inacio Martins wrote:
>> Hi,
>>
>> I already read here some discussions about performance issues with Sun
>> Machines, UltraSPARC T1 and T2.
>> Unfortunately this is my destiny; soon I will migrate from HP-UX to a Sun
>> T5120 SPARC T2 with
>> 4 Cores and 32 Threads (32 GB RAM).
>>
>> I already have the machine available and ready for tests...
>>
>> My initial objective is to discover the best way to work with this SPARC T2
>> processor and its threads.
>>
>> For this test I loaded an old copy of our production database (via HPL),
>> loading only the tables, without indexes.
>>
>> To assess the overall performance of the machine I chose to execute "create
>> index" on two "big" tables.
>> I consider this a good test because it needs to do a lot of I/O to read/write
>> all the data, plus memory usage and CPU in parallel to sort the indexes... if
>> somebody disagrees, please tell me why ...
>> Art ???? :)
>>
>> So, after my tests I got better performance when I used more threads and not
>> only the "cores".
>> If there is something I forgot to set or consider, please tell me.
>>
>> During the process, monitoring the machine, I see:
>> - For certain indexes, more than 150 threads in the same session...
>> - I/O reads up to 200 MB/sec, writes up to 80 MB/sec
>> - When configured with only 4 CPU threads, prstat shows 100% CPU
>> utilization, and with vmstat I get some small values in the CPU run queue (4 - 8)
>> - When configured with all 32 threads, I rarely get 90% CPU utilization.
>
> What I/O waits do you see (IDS wait reason stats (wts utility) / sar -d /
> iostat)?
>
>> The reason for this, in my opinion, is that I didn't use all 32 threads
>> (affinity)...
>>
>> What I don't understand is why the instance appears to work better with AIO VPs
>> and not KAIO ... any comments??
>
> yes.
> SOL10 will use every single Byte of memory left over as cache, but see below.
>
> And *do not* ask ME, what I think about SOL10's page coloring algo .....
>
> Try to convince your sysadmin to set IDS processes to real-time priority; this
> is the best substitute I was able to find in 4 yrs for what I used to
> gain from NOAGE.
>
>> ============================================
>> Here is the environment:
>>
>> Machine: Sun T5120 UltraSPARC T2
>> 4 Cores / 32 Threads - 1.2Ghz
>> 32 GB RAM
>> Solaris 10 - installed in the internal disks
>> Database storage: connected by SAN to an IBM DS8300 storage unit (RAID 5), 16
>> devices/LUNs of 50GB each.
>>
>> IDS 11.50 FC4TL Enterprise TimeLimited (for evaluation)
>> Here are the more important items in the ONCONFIG:
>>
>> * = parameters changed for each test
>> ---------------------------------------------
>> *VPCLASS cpu,num=16,aff=(10-26),noage
>> *VPCLASS aio,num=10,aff=(27-31),noage
>> AUTO_AIOVPS 1
>
> The best you can do with AUTO_* is to avoid it.
> When you understand and see what is going on, and
> When you 'like' what you see (or better when you think 'I can live with
> this'),
> then you can try if AUTO_* comes near that.
> I always fail in step 2.
>
>> SHMVIRTSIZE 10240000
>
> what is your SHMADD?
> what does onstat -g seg show?
> how much % of memory do you use when creating the big index
> (onstat - has the value on the right side)?
> do you see sorts on disk (onstat -D shows page writes on temp dbspaces)?
>
>> SHMTOTAL 30000000
>> AUTO_CKPTS 1
>> DS_MAX_QUERIES 1
>> DS_TOTAL_MEMORY 9216000
>> DS_MAX_SCANS 4
>
> too low, make it equal to your no of LUNs * 2,
> but *after* upping your no of LUNs
>
>> DS_NONPDQ_QUERY_MEM 150000
>
> too high for me -> 50000 is high already, as
> you won't use this memory in your test.
>
>> BUFFERPOOL
>> default,buffers=10000,lrus=8,lru_min_dirty=50.000000,lru_max_dirty=60.500000
>> BUFFERPOOL
>> size=2K,buffers=50000,lrus=8,lru_min_dirty=50.000000,lru_max_dirty=60.000000
>> AUTO_LRU_TUNING 1
>> BUFFERPOOL
>> size=4K,buffers=1250000,lrus=16,lru_min_dirty=50.000000,lru_max_dirty=60.500000
>> BUFFERPOOL
>> size=8K,buffers=625000,lrus=16,lru_min_dirty=50.000000,lru_max_dirty=60.500000
>> BUFFERPOOL
>> size=16K,buffers=187000,lrus=8,lru_min_dirty=50.000000,lru_max_dirty=60.500000
>> ---------------------------------------------
>>
>> The dbspaces of the instance are spread over the 16 LUNs on
>> the DS8300, accessing the devices directly (RAW - /dev/rdsk).
>> There are dbspaces with page sizes of 4k, 8k and 16k.
>>
>> The 2 Tables used for tests:
>> ================================
>> TABLE_A - A historical table
>> ================================
>> *data* size after load with HPL = 61.6 GB
>> Partitioned into 6 fragments by expression (date field filtering by semester)
>> Each fragment is in a different and exclusive dbspace (inherited from the old
>> structure)
>> dbspace page size = 16k
>>
>> It has 3 indexes with the same partitioning expression, 6 fragments.
>> Each index has its own exclusive dbspace.
>> dbspace page size = 8k
>> index Tab_A_ix_A - size 2.4 GB when created
>> index Tab_A_ix_B - size 2.4 GB when created
>> index Tab_A_ix_C - size 2.4 GB when created
>>
>> ================================
>> TABLE_B - Other historical table
>> ================================
>> *data* size after load with HPL = 81.3 GB
>> Partitioned into 21 fragments by expression (date field filtering by semester)
>> Each fragment is in a different and exclusive dbspace (inherited from the old
>> structure)
>> dbspace page size = 16k
>>
>> It has 3 indexes with the same partitioning expression, so 21 fragments.
>> Each index has its own exclusive dbspace.
>> dbspace page size = 8k
>> index Tab_B_ix_A - size 7.3 GB when created
>> index Tab_B_ix_B - size 7.6 GB when created
>> index Tab_B_ix_C - size 13.7 GB when created
>>
>> Summary of Informix configuration:
>> BufferPool - 2k=100MB, 4k=5GB, 8k=5GB, 16k=2.9GB
>
> --> 8KB bufferpool size 5GB, largest idx in this pool 13.7 GB
>
> If you increase the 8KB bufferpool so that you can cache a higher % of your
> index, and if you tune PDQ memory just small enough that you *do not* sort on
> disk (== no page writes seen on the temp dbspace(s)), you will no longer see
> the caching effects in the excess (== 'outside' IDS) memory --> AIO will no
> longer be of advantage.
> If AIO is faster than KAIO - even when you can use mount options like
> forcedirectio - more often than not this indicates that the bufferpool is sized
> too small.
>
> What I have to remark here with *sarcasm on*:
> Even on a machine which definitely has low memory throughput, caching
> I/O instead of waiting *yawn* for I/O to a DS8300 is a good thing.
> *sarcasm off*
>
> [sadly there is no OFF switch for a certain amount of background sarcasm here]
>
> 16 LUNs into a DS8300 is not good, I recommend 64 LUNs at least, so you
> can queue up more unsatisfied I/Os in parallel.
> The microconfiguration on a DS8300 IS very important: You will want to
> have long thin type striping, and *of course* NO RAID 5 or 6!
>
> Who wanted to save 3 bucks on this cheap type of unit by reducing 'slack
> disk space'?
>
> Try to get the one responsible fired, or at least
> sent to work for the competition with full salary from your company.
>
> But as your timing values show, you are going down the nasty road.
>
> On the laptop I am writing this on, I can create an index of 20GB with a
> 35 byte entry size off a 100GB table with max rsize 410 and avg rsize 375
> in less than 7 mins.
> Now this laptop IS a good machine (8GB, 256 GB SSD, 300 GB 7200 rpm disk)
> but everyone and her sister can mail-order it from alienware.com
>
>> SharedMemory - 10GB
>> DS_TOTAL_MEMORY - 9GB
>>
>> ===============================================
>> ===============================================
>> ===============================================
>> To execute these tests, I ran the create index statements sequentially,
>> with a checkpoint between them, using PDQ = 100 and measuring the time
>> with the command "time" on the dbaccess execution
>> (hardware thread IDs = 0 to 31)
>>
>> Here are the test results:
>>
>> - TEST 1
>> Machine with 32 Threads ONLINE
>> 8 CPU VPs without configuring AFFINITY
>> Using KAIO - 4 AIO VPs
>>
>> TABLE_A - Tab_A_ix_A - 12m5s
>> TABLE_A - Tab_A_ix_B - 11m40s
>> TABLE_A - Tab_A_ix_C - 11m46s
>> TABLE_B - Tab_B_ix_A - 50m33s
>> TABLE_B - Tab_B_ix_B - 50m11s
>> TABLE_B - Tab_B_ix_C - 90m24s
>
> How long does it take to scan table B?
> Try to find out where you lose the time.
> Even your HW should be able to read a table
> of 80GB in approx 20 minutes (I have seen more than 4GB per min
> on a DS8300). It takes some time to write 7.3GB (the 2 smaller idxes)
> but it should not be slower than writing 2GB / minute.
> Where do you spend the rest? I have no explanation for 16 mins:
> read table: 20 mins or better
> write idx: 14 mins or better
> sort: I can see no way to make it so slow that it consumes 16 mins.
> NO, a T5120 is not *that* slow ;)
> But your total time is 50 mins. IFF you do not see excessive I/O waits.
> If you see high I/O waits (more than 20%) then your sysadmins must start
> testing with bonnie++ or some such and reconfigure the number of LUNs,
> striping and cache params of the DS8300. Or find the misconfiguration on the
> switch you might have between the T5120 and the DS8300, or stop that other
> testing that is using your bandwidth on the way to the DS8300, or stop
> backing up the DS8300 to tape while you create your indexes.
>
>> - TEST 2
>> Machine with 32 Threads ONLINE
>> 3 CPU VPs with AFFINITY 10,16,24
>> Using KAIO - 4 AIO VPs
>>
>> TABLE_A - Tab_A_ix_A - 13m5s
>> TABLE_A - Tab_A_ix_B - 13m56s
>> TABLE_A - Tab_A_ix_C - 16m12s
>> TABLE_B - Tab_B_ix_A - 100m12s
>> TABLE_B - Tab_B_ix_B - 96m17s
>> TABLE_B - Tab_B_ix_C - 153m37s
>>
>> - TEST 3
>> Machine with 32 Threads ONLINE
>> 20 CPU VPs with AFFINITY 10-30
>> Using KAIO - 4 AIO VPs
>>
>> TABLE_A - Tab_A_ix_A - 12m10s
>> TABLE_A - Tab_A_ix_B - 11m38s
>> TABLE_A - Tab_A_ix_C - 11m56s
>> TABLE_B - Tab_B_ix_A - 46m47s
>> TABLE_B - Tab_B_ix_B - 47m3s
>> TABLE_B - Tab_B_ix_C - 90m51s
>>
>> - TEST 4
>> Machine with 4 Threads ONLINE (0,8,16,24) - All other threads are hard
>> disabled on the ILOM (console)
>> 4 CPU VPs with AFFINITY 0,8,16,24
>> Using KAIO - 4 AIO VPs
>>
>> TABLE_A - Tab_A_ix_A - 13m2s
>> TABLE_A - Tab_A_ix_B - 14m11s
>> TABLE_A - Tab_A_ix_C - 15m17s
>> TABLE_B - Tab_B_ix_A - 79m51s
>> TABLE_B - Tab_B_ix_B - 77m27s
>> TABLE_B - Tab_B_ix_C - 129m24s
>>
>> - TEST 5
>> Machine with 4 Threads ONLINE (0,8,16,24) - All other threads are hard
>> disabled on the ILOM (console)
>> Threads 16 and 24 configured to no-intr (non-interruptible, so they don't handle
>> I/O interrupts); psradm was used to do this.
>> 4 CPU VPs without AFFINITY
>> Using KAIO - 4 AIO VPs
>>
>> TABLE_A - Tab_A_ix_A - 15m2s
>> TABLE_A - Tab_A_ix_B - 12m57s
>> TABLE_A - Tab_A_ix_C - 13m18s
>> TABLE_B - Tab_B_ix_A - 81m4s
>> TABLE_B - Tab_B_ix_B - 76m55s
>> TABLE_B - Tab_B_ix_C - 123m2s
>>
>> - TEST 6
>> Machine with 32 Threads ONLINE
>> 11 CPU VPs with AFFINITY 10-20
>> NOT Using KAIO (KAIOOFF=1) - 10 AIO VPs with AFFINITY 21-30
>>
>> TABLE_A - Tab_A_ix_A - 11m19s
>> TABLE_A - Tab_A_ix_B - 11m10s
>> TABLE_A - Tab_A_ix_C - 11m47s
>> TABLE_B - Tab_B_ix_A - 50m38s
>> TABLE_B - Tab_B_ix_B - 49m55s
>> TABLE_B - Tab_B_ix_C - 91m53s
>>
>> - TEST 7
>> Machine with 32 Threads ONLINE
>> Threads between 12 - 20 configured to no-intr (I changed this after oninit has
>> been initialized, to keep the affinity)
>> 11 CPU VPs with AFFINITY 10-20
>> NOT Using KAIO - 10 AIO VPs with AFFINITY 21-30
>>
>> TABLE_A - Tab_A_ix_A - 11m26s
>> TABLE_A - Tab_A_ix_B - 11m27s
>> TABLE_A - Tab_A_ix_C - 11m17s
>> TABLE_B - Tab_B_ix_A - 49m49s
>> TABLE_B - Tab_B_ix_B - 51m6s
>> TABLE_B - Tab_B_ix_C - 91m37s
>>
>> - TEST 8
>> Machine with 32 Threads ONLINE
>> Threads between 12 - 25 configured to no-intr (I changed this after oninit has
>> been initialized, to keep the affinity)
>> 16 CPU VPs with AFFINITY 10-26
>> NOT Using KAIO - 10 AIO VPs with AFFINITY 27-31 (2 AIO VPs per CPU)
>>
>> TABLE_A - Tab_A_ix_A - 10m56s
>> TABLE_A - Tab_A_ix_B - 11m10s
>> TABLE_A - Tab_A_ix_C - 11m
>> TABLE_B - Tab_B_ix_A - 46m35s
>> TABLE_B - Tab_B_ix_B - 41m41s
>> TABLE_B - Tab_B_ix_C - 87m59s
>>
>> - TEST 9
>> Machine with 32 Threads ONLINE
>> 16 CPU VPs with AFFINITY 10-26
>> NOT Using KAIO - 10 AIO VPs with AFFINITY 27-31 (2 AIO VPs per CPU)
>>
>> TABLE_A - Tab_A_ix_A - 10m27s
>> TABLE_A - Tab_A_ix_B - 11m17s
>> TABLE_A - Tab_A_ix_C - 10m48s
>> TABLE_B - Tab_B_ix_A - 46m32s
>> TABLE_B - Tab_B_ix_B - 47m5s
>> TABLE_B - Tab_B_ix_C - 90m4s
>>
>>
>>
>
--
Richard Kofler
SOLID STATE EDV
Dienstleistungen GmbH
Vienna/Austria/Europe