Q: My models that mix hex and tet elements seem to run painfully slowly,
even when I change all eligible SOLID95s to SOLID92s. I checked the archives
and saw John Crawford's message of 1/11/1999 regarding this matter
("tet 95 elements -- slower that tet 92?").
He found that even for 2 models with the same number of dof, the all-92 job
ran faster than the 95/92 mixture.
Why is this? Shouldn't the job speeds be comparable for the same number
of dof?
A: There are a couple of things I think are going on w/ 95 tets vs. 92 tets.
I think that the extra 10 unused nodes of each 95 tet still take up space in
memory, so this increases RAM (& other) demands. Also, I think that the PCG
solver treats 95 tets and 92 tets differently. I could be imagining it, but
I've noticed that more jobname.PC* files are generated w/ 95s. Of course,
these *.PC* files get cleaned up afterwards, so I can't confirm this, and it
could be a figment of my imagination, as many tend to be.
Instead of boring you w/ my mere speculation (I'd rather bore you w/ my
data), I ran a simple beam meshed w/ 95 tets, fixed at one end and loaded
axially (random/arbitrary case I set up), then reran it w/ 92 tets. Both
runs have the same number of nodes & elements (I used TCHG to convert the 95
tets to 92 tets). I used DIRECT,OFF to force jobname.emat and esav files to
be generated (use of DIRECT,OFF is undocumented; also, in general, you never
want to use it since direct assembly is SO much faster -- I just did it to
force ANSYS to write emat and esav files). This is what I got in the Output
summary:
SOLID92 TET:
90.250 MB WRITTEN ON ELEMENT MATRIX FILE: tet9295.emat
5.000 MB WRITTEN ON ELEMENT SAVED DATA FILE: tet9295.esav
13.625 MB WRITTEN ON RESULTS FILE: tet9295.rst
SOLID95 TET:
337.375 MB WRITTEN ON ELEMENT MATRIX FILE: tet9295.emat
15.562 MB WRITTEN ON ELEMENT SAVED DATA FILE: tet9295.esav
28.062 MB WRITTEN ON RESULTS FILE: tet9295.rst
Note that the larger EMAT, ESAV, and RST files for 95 tets show that the 10
collapsed nodes per 95 tet are kept, leading to unnecessary CPU time and disk
I/O.
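
As a rough sanity check on those file sizes, the ratios can be compared w/
what the node counts alone would predict. This is only a sketch in Python
(not ANSYS input). The sizes are copied from the output above; the idea that
element matrix storage scales w/ the square of the dof per element (3 dof
per node) is an assumption of this sketch, not something ANSYS reports.

```python
# File sizes in MB, copied from the two output summaries above.
sizes_mb = {
    "emat": {"SOLID92": 90.250, "SOLID95": 337.375},
    "esav": {"SOLID92": 5.000, "SOLID95": 15.562},
    "rst": {"SOLID92": 13.625, "SOLID95": 28.062},
}

# Ratio of SOLID95 file size to SOLID92 file size, per file.
ratios = {name: s["SOLID95"] / s["SOLID92"] for name, s in sizes_mb.items()}

# Assumption: 3 dof per node, and matrix storage grows w/ dof squared.
dof_92 = 10 * 3  # 10-node tet
dof_95 = 20 * 3  # 20-node brick collapsed to a tet, all 20 node slots kept

# Estimate if the full (dense) element matrix is stored.
dense_ratio = dof_95**2 / dof_92**2
# Estimate if only the symmetric upper triangle is stored.
sym_ratio = (dof_95 * (dof_95 + 1) / 2) / (dof_92 * (dof_92 + 1) / 2)

for name, r in ratios.items():
    print(f"{name}: SOLID95 / SOLID92 = {r:.2f}")
print(f"estimated matrix ratio, dense storage: {dense_ratio:.2f}")
print(f"estimated matrix ratio, symmetric storage: {sym_ratio:.2f}")
```

The EMAT ratio (about 3.7) lands close to the 3.9 to 4.0 that a 60-dof vs.
30-dof element matrix would give, which is at least consistent w/ the
collapsed nodes being carried along.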
If you decide to run a similar case, note the following:
- Element preparation takes more CPU time for 95 tets
- Solving takes longer for 95 tets
- Results calculation (stress recovery) takes longer for 95 tets
This is what I have seen in general in the past, and this is also confirmed
w/ a simple case I ran. For example, examining the output of solution,
element preparation is more for 95 tets:
SOLID92 TET:
*** ELEMENT MATRIX FORMULATION TIMES
  TYPE  NUMBER  ENAME    TOTAL CP    AVE CP
     1   11546  SOLID92     6.259  0.000542
SOLID95 TET:
*** ELEMENT MATRIX FORMULATION TIMES
  TYPE  NUMBER  ENAME    TOTAL CP    AVE CP
     1   11546  SOLID95    27.179  0.002354
For element output (stress calculations), the same is true:
SOLID92 TET:
*** ELEMENT RESULT CALCULATION TIMES
  TYPE  NUMBER  ENAME    TOTAL CP    AVE CP
     1   11546  SOLID92     5.468  0.000474
*** NODAL LOAD CALCULATION TIMES
  TYPE  NUMBER  ENAME    TOTAL CP    AVE CP
     1   11546  SOLID92     1.452  0.000126
SOLID95 TET:
*** ELEMENT RESULT CALCULATION TIMES
  TYPE  NUMBER  ENAME    TOTAL CP    AVE CP
     1   11546  SOLID95    18.376  0.001592
*** NODAL LOAD CALCULATION TIMES
  TYPE  NUMBER  ENAME    TOTAL CP    AVE CP
     1   11546  SOLID95     3.435  0.000298
Of course, I should run this case many times to get decent "average"
results, but I ran both on the same hardware, 95's first, then 92's, so I'm
guessing these results wouldn't differ too much.
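
To put the timing tables side by side, here is a small Python sketch that
turns the TOTAL CP columns into slowdown factors. The numbers are copied
from the output above; the variable names are just for illustration, and
this is a single run, so treat the factors as rough.

```python
N_ELEMENTS = 11546  # NUMBER column, identical in both runs

# TOTAL CP seconds, copied from the timing tables above.
total_cp = {
    "element matrix formulation": {"SOLID92": 6.259, "SOLID95": 27.179},
    "element result calculation": {"SOLID92": 5.468, "SOLID95": 18.376},
    "nodal load calculation": {"SOLID92": 1.452, "SOLID95": 3.435},
}

# How many times longer each step takes w/ 95 tets than w/ 92 tets.
slowdown = {step: t["SOLID95"] / t["SOLID92"] for step, t in total_cp.items()}

# Per-element CP time, which should reproduce the AVE CP column.
per_element = {
    step: {ename: cp / N_ELEMENTS for ename, cp in t.items()}
    for step, t in total_cp.items()
}

# Extra CP seconds spent outside the solver when using 95 tets.
extra_seconds = sum(t["SOLID95"] - t["SOLID92"] for t in total_cp.values())

for step, factor in slowdown.items():
    print(f"{step}: {factor:.2f}x slower w/ 95 tets")
print(f"extra CP time w/ 95 tets: {extra_seconds:.3f} s")
```

For this one case, element formulation is roughly 4.3 times slower, stress
recovery roughly 3.4 times, and nodal load calculation roughly 2.4 times.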
Another really interesting thing is that if you view the PCG solver status
file (jobname.PCS) for both runs, it shows the following:
SOLID92 TET:
Nonzeros in Upper Triangular part of
Global Stiffness Matrix : 2150307
Nonzeros in Preconditioner: 822126
Total Operation Count: 1.02694e+009
Total Iterations In PCG: 81
Average Iterations Per Load Case: 81
Input PCG Error Tolerance: 1e-008
Achieved PCG Error Tolerance: 9.96546e-009
********************************************************
Total PCG Solver CP Time: User: 82 secs: System: 0 secs
********************************************************
SOLID95 TET:
Nonzeros in Upper Triangular part of
Global Stiffness Matrix : 2150307
Nonzeros in Preconditioner: 1838616
Total Operation Count: 1.1597e+009
Total Iterations In PCG: 55
Average Iterations Per Load Case: 55
Input PCG Error Tolerance: 1e-008
Achieved PCG Error Tolerance: 9.7922e-009
***********************************************************
Total PCG Solver CP Time: User: 124.7 secs: System: 0 secs
***********************************************************
I am not well-versed in the intricacies of conjugate gradient solvers, but
the fact that the nonzeros in the preconditioner differ between the two
cases and that the total iterations differ leads me to believe that the PCG
solver isn't "seeing" the same matrices for 95 tets and 92 tets -- the
nonzeros of the [K] matrix are the same, of course, but I think that ANSYS
feeds the PCG solver more zeros in one instance (95 tets) to account for the
extra 10 nodes that are collapsed in degenerate form. Note that the PCG
solution time is shorter for 92 tets, too, as I mentioned above.
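
The two PCS summaries can be boiled down w/ a few lines of Python. The
values are copied from the files above. Dividing total CP time by the
iteration count is only a crude average, since the reported time presumably
also covers work outside the iteration loop (that is an assumption on my
part).

```python
# Values copied from the two jobname.PCS summaries above.
pcs = {
    "SOLID92": {"precond_nonzeros": 822126, "iterations": 81, "cp_seconds": 82.0},
    "SOLID95": {"precond_nonzeros": 1838616, "iterations": 55, "cp_seconds": 124.7},
}

# Crude average cost of one PCG iteration in each run.
sec_per_iter = {ename: d["cp_seconds"] / d["iterations"] for ename, d in pcs.items()}

# SOLID95-to-SOLID92 ratios for preconditioner size and total solver time.
precond_ratio = pcs["SOLID95"]["precond_nonzeros"] / pcs["SOLID92"]["precond_nonzeros"]
time_ratio = pcs["SOLID95"]["cp_seconds"] / pcs["SOLID92"]["cp_seconds"]

for ename, s in sec_per_iter.items():
    print(f"{ename}: {s:.2f} CP s per iteration")
print(f"preconditioner nonzeros, 95 vs. 92: {precond_ratio:.2f}x")
print(f"total PCG CP time, 95 vs. 92: {time_ratio:.2f}x")
```

So the 95 run needs fewer iterations (55 vs. 81), but each one costs more
than twice as much on average, and its preconditioner holds about 2.2 times
as many nonzeros. The net result is a PCG solve that takes about 1.5 times
as long.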
So what's the bottom line? Use TCHG whenever possible to convert
degenerate 95 tets to 92 tets for faster solution times. It would be
great if this were done internally by ANSYS, but I can't really complain
since the PCG solver rocks.
Of course, if one was more bored than I am, one could run cases of mixed
95 hex and pent with 92/95 tets or with other element types, etc., but I am
quite sure one would come to the same conclusions. This behavior of 95 tets
is stuff I've seen "in general" in the past quite often. I believe a
similar situation exists for SOLID186 & 187 tets.
Anyways, I hope that this might've explained *why* 95 tets run 'slower'
than 92 tets and provided some data on this. It's those pesky 50% of nodes
(10 collapsed nodes/element) which like to hang around the stiffness matrix
and take up valuable real estate. It's kinda analogous to the fact that I
can remember every Simpsons episode verbatim but can't recall simple beam
equations w/o looking them up in Roark & Young. I have no idea why my brain
is cluttered with worthless knowledge, and I can't explain why the stiffness
matrix has contributions from collapsed nodes of degenerate elements.