SOLID92 vs. Degenerate SOLID95
  Q: My models that have hex and tet element mixtures in them seem to run painfully slow, even when I change all eligible solid95s to solid92s. I checked the archives and saw John Crawford's message of 1/11/1999 regarding this matter ("tet 95 elements -- slower that tet 92?"). He found that even for 2 models with the same number of dof, the all-92 job ran faster than the 95/92 mixture.

Why is this? Shouldn't the job speeds be comparable for the same number of dof?

A: There are a couple of things I think are going on w/ 95 tets vs. 92 tets. I think that the extra 10 unused nodes of each 95 tet are still being carried in memory, so this increases RAM (& other) demands. Also, I think that the PCG solver treats 95 tets and 92 tets differently. I've noticed that more jobname.PC* files are generated w/ 95s. Of course, these *.PC* files get cleaned up afterwards, so I can't confirm this; it could be a figment of my imagination, as many tend to be.

Instead of boring you w/ my mere speculation (I'd rather bore you w/ my data), I ran a simple beam meshed w/ 95 tets, fixed at one end and loaded axially (random/arbitrary case I set up). I then ran it again w/ 92 tets, so both runs have the same number of nodes & elements (I used TCHG to convert the 95 tets to 92 tets). I used DIRECT,OFF to force jobname.emat and esav files to be generated (use of DIRECT,OFF is undocumented; also, in general, you never want to use it since direct assembly is SO much faster -- I just did it to force ANSYS to write emat and esav files). This is what I got in the Output summary:

SOLID92 TET:
90.250 MB WRITTEN ON ELEMENT MATRIX FILE: tet9295.emat
5.000 MB WRITTEN ON ELEMENT SAVED DATA FILE: tet9295.esav
13.625 MB WRITTEN ON RESULTS FILE: tet9295.rst

SOLID95 TET:
337.375 MB WRITTEN ON ELEMENT MATRIX FILE: tet9295.emat
15.562 MB WRITTEN ON ELEMENT SAVED DATA FILE: tet9295.esav
28.062 MB WRITTEN ON RESULTS FILE: tet9295.rst

Note that the larger EMAT, ESAV, and RST files for 95 tets show that the 10 collapsed nodes per 95 tet are kept, leading to unnecessary CPU time and disk I/O.
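To put numbers on the file-size difference, here is a small Python sketch that parses the Output summary lines quoted above and prints the 95-to-92 size ratio for each file. The regular expression and the helper name `sizes_mb` are just illustrative choices of mine, not anything ANSYS provides. The "naive" estimate assumes 3 DOF per node and a dense square element matrix (60x60 for a 20-node element vs. 30x30 for a 10-node one); that is a back-of-the-envelope assumption, not a statement of how the emat file is actually laid out.

```python
import re

# Output-summary lines copied from the two runs above.
SUMMARY = {
    "SOLID92": """\
90.250 MB WRITTEN ON ELEMENT MATRIX FILE: tet9295.emat
5.000 MB WRITTEN ON ELEMENT SAVED DATA FILE: tet9295.esav
13.625 MB WRITTEN ON RESULTS FILE: tet9295.rst
""",
    "SOLID95": """\
337.375 MB WRITTEN ON ELEMENT MATRIX FILE: tet9295.emat
15.562 MB WRITTEN ON ELEMENT SAVED DATA FILE: tet9295.esav
28.062 MB WRITTEN ON RESULTS FILE: tet9295.rst
""",
}

# Captures the size in MB and the file extension at the end of the line.
LINE = re.compile(r"([\d.]+) MB WRITTEN ON .*\.(\w+)$")


def sizes_mb(text):
    """Map file extension (emat, esav, rst) to the size in MB."""
    out = {}
    for line in text.splitlines():
        m = LINE.search(line.strip())
        if m:
            out[m.group(2)] = float(m.group(1))
    return out


s92 = sizes_mb(SUMMARY["SOLID92"])
s95 = sizes_mb(SUMMARY["SOLID95"])
ratios = {ext: s95[ext] / s92[ext] for ext in s92}

# Assumption: 3 DOF per node, dense square element matrices.
naive_emat_ratio = (20 * 3) ** 2 / (10 * 3) ** 2

for ext, r in ratios.items():
    print(f"{ext}: {s92[ext]:8.3f} MB -> {s95[ext]:8.3f} MB  (x{r:.2f})")
print(f"naive 60x60 vs 30x30 estimate for emat: x{naive_emat_ratio:.1f}")
```

The emat file comes out roughly 3.7 times larger for the 95 tets, which is at least in the neighborhood of the naive factor of 4 you would expect if all 20 nodes per element were being written.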

If you decide to run a similar case, note the following:
  1. Element preparation takes more CPU time for 95 tets
  2. Solving takes longer for 95 tets
  3. Results calculation (stress recovery) takes longer for 95 tets
This is what I have seen in general in the past, and it is also confirmed by the simple case I ran. For example, looking at the solution output, element preparation takes longer for 95 tets:

SOLID92 TET:
*** ELEMENT MATRIX FORMULATION TIMES
TYPE NUMBER ENAME TOTAL CP AVE CP
1 11546 SOLID92 6.259 0.000542

SOLID95 TET:
*** ELEMENT MATRIX FORMULATION TIMES
TYPE NUMBER ENAME TOTAL CP AVE CP
1 11546 SOLID95 27.179 0.002354

For element output (stress calculations), the same is true:

SOLID92 TET:
*** ELEMENT RESULT CALCULATION TIMES
TYPE NUMBER ENAME TOTAL CP AVE CP
1 11546 SOLID92 5.468 0.000474

*** NODAL LOAD CALCULATION TIMES
TYPE NUMBER ENAME TOTAL CP AVE CP
1 11546 SOLID92 1.452 0.000126

SOLID95 TET:
*** ELEMENT RESULT CALCULATION TIMES
TYPE NUMBER ENAME TOTAL CP AVE CP
1 11546 SOLID95 18.376 0.001592

*** NODAL LOAD CALCULATION TIMES
TYPE NUMBER ENAME TOTAL CP AVE CP
1 11546 SOLID95 3.435 0.000298

Of course, I should run this case many times to get a decent "average" result, but I ran both on the same hardware, 95's first, then 92's, so I'm guessing these results shouldn't differ too much.
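For anyone who wants the slowdown factors rather than raw CP times, the timing tables above reduce to a few divisions. This Python sketch only does arithmetic on the TOTAL CP numbers quoted in this post (the dictionary layout and variable names are my own); it also recomputes the per-element average as a sanity check against the AVE CP column.

```python
# Number of elements and TOTAL CP seconds per phase, copied from the tables above.
N_ELEMS = 11546
TOTAL_CP = {
    "matrix formulation": {"SOLID92": 6.259, "SOLID95": 27.179},
    "result calculation": {"SOLID92": 5.468, "SOLID95": 18.376},
    "nodal load calculation": {"SOLID92": 1.452, "SOLID95": 3.435},
}

# How many times longer each phase takes with degenerate SOLID95 tets.
slowdown = {phase: t["SOLID95"] / t["SOLID92"] for phase, t in TOTAL_CP.items()}

# Per-element average, to compare against the AVE CP column.
ave_cp = {
    phase: {ename: total / N_ELEMS for ename, total in t.items()}
    for phase, t in TOTAL_CP.items()
}

# Element-level CP time summed over the three phases.
total92 = sum(t["SOLID92"] for t in TOTAL_CP.values())
total95 = sum(t["SOLID95"] for t in TOTAL_CP.values())

for phase, r in slowdown.items():
    print(f"{phase:24s} x{r:.2f}")
print(f"{'all three phases':24s} x{total95 / total92:.2f}")
```

For this one case, matrix formulation is the phase hit hardest (a bit over 4x), with result calculation and nodal load calculation at roughly 3.4x and 2.4x. These are single-run numbers, so treat them as indicative only.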

Another really interesting thing is that if you view the PCG solver status file (jobname.PCS) for both runs, it shows the following:

SOLID92 TET:
Nonzeros in Upper Triangular part of
Global Stiffness Matrix : 2150307
Nonzeros in Preconditioner: 822126

Total Operation Count: 1.02694e+009
Total Iterations In PCG: 81
Average Iterations Per Load Case: 81
Input PCG Error Tolerance: 1e-008
Achieved PCG Error Tolerance: 9.96546e-009
********************************************************
Total PCG Solver CP Time: User: 82 secs: System: 0 secs
********************************************************

SOLID95 TET:
Nonzeros in Upper Triangular part of
Global Stiffness Matrix : 2150307
Nonzeros in Preconditioner: 1838616

Total Operation Count: 1.1597e+009
Total Iterations In PCG: 55
Average Iterations Per Load Case: 55
Input PCG Error Tolerance: 1e-008
Achieved PCG Error Tolerance: 9.7922e-009
***********************************************************
Total PCG Solver CP Time: User: 124.7 secs: System: 0 secs
***********************************************************

I am not well-versed in the intricacies of conjugate gradient solvers, but the fact that the nonzeros in the preconditioner differ between the two cases and that the total iterations differ leads me to believe that even the PCG solver isn't "seeing" the same matrices for 95 tets and 92 tets -- the nonzeros of the [K] matrix are the same, of course, but I think that ANSYS feeds the PCG solver more zeros in one instance (95 tets) to account for the extra 10 nodes that are collapsed in degenerate form. Note that solution time for the PCG solver is shorter for 92 tets, too, as I mentioned above, even though the 92 run needs more iterations (81 vs. 55).
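One way to look at the two PCS summaries is per iteration rather than in total. The Python sketch below just divides the quoted numbers (the dictionary keys are names I made up for illustration). It does not explain what the solver is doing internally; it only shows that the cost per iteration goes up by about the same factor as the preconditioner nonzeros in this one case.

```python
# Values copied from the two jobname.PCS summaries above.
PCS = {
    "SOLID92": {"precond_nnz": 822126, "ops": 1.02694e9, "iters": 81, "cp_s": 82.0},
    "SOLID95": {"precond_nnz": 1838616, "ops": 1.1597e9, "iters": 55, "cp_s": 124.7},
}

# CP seconds and operation count per PCG iteration for each run.
per_iter = {
    name: {"cp_s": d["cp_s"] / d["iters"], "ops": d["ops"] / d["iters"]}
    for name, d in PCS.items()
}

# 95-to-92 ratios.
nnz_ratio = PCS["SOLID95"]["precond_nnz"] / PCS["SOLID92"]["precond_nnz"]
time_per_iter_ratio = per_iter["SOLID95"]["cp_s"] / per_iter["SOLID92"]["cp_s"]
ops_per_iter_ratio = per_iter["SOLID95"]["ops"] / per_iter["SOLID92"]["ops"]

for name, d in per_iter.items():
    print(f"{name}: {d['cp_s']:.3f} s/iter, {d['ops']:.3e} ops/iter")
print(f"preconditioner nonzeros:  x{nnz_ratio:.2f}")
print(f"CP time per iteration:    x{time_per_iter_ratio:.2f}")
print(f"operations per iteration: x{ops_per_iter_ratio:.2f}")
```

So the 95 run converges in fewer iterations, but each iteration costs a little more than twice as much CP time, which tracks the roughly 2.2x larger preconditioner. Whether that is cause or coincidence is more than one run can tell you.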

So what's the bottom line? Use TCHG whenever possible to convert degenerate 95 tets to 92 tets for faster solution times. It would be great if this were done internally by ANSYS, but I can't really complain since the PCG solver rocks.

Of course, if one was more bored than I am, one could run cases of mixed 95 hex and pent with 92/95 tets or with other element types, etc., but I am quite sure one would come to the same conclusions. This behavior of 95 tets is stuff I've seen "in general" in the past quite often. I believe a similar situation exists for SOLID186 & 187 tets.

Anyways, I hope that this might've explained *why* 95 tets run 'slower' than 92 tets and provided some data on this. It's those pesky 50% of nodes (10 collapsed nodes/element) which like to hang around the stiffness matrix and take up valuable real estate. It's kinda analogous to the fact that I can remember every Simpsons episode verbatim but can't recall simple beam equations w/o looking them up in Roark & Young. I have no idea why my brain is cluttered with worthless knowledge, and I can't explain why the stiffness matrix has contributions from collapsed nodes of degenerate elements.
  Posted by Sheldon Imaoka (CSI) on 05.16.2000