HP Workstation Z8 G4 AI Performance Evaluation

(News from ChinaIT.com) Since establishing its first technical service office in China in 1982 and becoming the world's first manufacturer to provide professional IT services in China, HP has been committed to bringing the best technical services to Chinese users and has continued to expand the depth and breadth of those services. HP workstations have served customers at home and abroad for many years, and are trusted by customers and well regarded in the market.

With the continuous development of artificial intelligence (AI), the AI capability of a workstation has become an important indicator. The Z8 G4 is one of HP's high-end workstations. It can be fitted with two NVIDIA RTX A6000 graphics cards, three NVIDIA RTX A5000 graphics cards, or four NVIDIA RTX A4000 graphics cards. Workstations equipped with professional GPUs also have broad application in the artificial intelligence industry. At present, the most common AI application is visual recognition, which is widely used in automatic product inspection, behavior recognition, face recognition, unmanned retail, and other fields. This test evaluates the training performance and inference performance of the Z8 G4 in pattern recognition and classification.

About HP Workstation

Internal structure

The inner cover divides the interior into two zones, one for the CPUs and one for the GPUs. This design not only helps the air ducts dissipate heat, but also provides sound insulation and protects the core components. The 4 independent hard disk bays support large-capacity hard disks and are very convenient to maintain.

Processor:
Supports dual Intel Xeon Scalable processors, up to dual Intel Xeon Platinum 8260L processors

RAM:
24 memory slots; supports up to 3 TB using registered DIMMs

Expansion slot:
3 x PCIe Gen 3 x4 and 4 x PCIe Gen 3 x16; supports up to 4 graphics cards.

Hard disk:
Supports up to (5) 8.89 cm (3.5-inch) 7200 rpm SATA hard disk drives (HDD) in 500 GB, 1.0, 2.0, 4.0, or 8.0 TB capacities, for a maximum total capacity of 40 TB; SATA solid state drive (SSD); PCIe solid state drive (SSD); M.2 SSD (supports 4)

Front

2 USB 3.1 Gen 1 Type-A ports (the leftmost port has charging function);

2 USB 3.1 Gen 2 Type-C ports;

1 combo headphone jack;

1 optional media card reader;

1 slim DVD-ROM drive, for convenient external expansion.

Side

The side panel can be removed without tools. It has a lock, and maintenance personnel can take it off easily. The two zone covers over the CPU and PCIe areas can be removed and refitted separately, also without tools. The 4 hard disk bays can likewise be removed and refitted individually, which is very convenient.

Rear

6 USB 3.1 Gen 1 (aka USB 3.0) ports;

1 serial port;

1 PS/2 keyboard port;

1 PS/2 mouse port;

2 RJ-45 ports for integrated Gigabit LAN;

1 audio line-in port (can be reassigned as a microphone port);

1 audio line-out port;

1 power connector (optional 1700 W high-efficiency power supply).

Beyond the hardware described above, the HP workstation is also well equipped in its test configuration and software list.

About this test

Compared with the previous-generation Turing architecture, the performance of Ampere architecture professional cards is greatly improved. The 8nm process lets the GPU integrate more computing units, and the stronger computing power greatly improves deep learning performance.

High-end Ampere architecture graphics cards carry second-generation RT Cores and third-generation Tensor Cores. The main purpose of the RT Cores is to accelerate ray-traced rendering; the main purpose of the Tensor Cores is to accelerate deep learning training and inference.

In addition, the Ampere architecture is the first to support TF32 operations, which run on the Tensor Cores to speed up training. In some scenarios, a model can be trained faster with TF32.
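To give a feel for what TF32 trades away, here is a minimal sketch that reduces a 32-bit float to TF32 precision. TF32 keeps the 8-bit exponent of single precision but only 10 explicit mantissa bits. The function name `to_tf32` is hypothetical, and clearing the low bits (truncation) is a simplifying assumption; it is not claimed to match how the hardware rounds.

```python
import struct


def to_tf32(x: float) -> float:
    """Reduce a value to TF32 precision by truncating the float32 mantissa.

    Assumption: truncation toward zero stands in for hardware rounding.
    """
    # Reinterpret the value as its 32-bit IEEE 754 single-precision pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # float32 has 23 mantissa bits; TF32 keeps the top 10, so clear the low 13.
    bits &= 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# A step of 2**-10 above 1.0 survives; a step of 2**-11 is lost.
print(to_tf32(1.0 + 2**-10))
print(to_tf32(1.0 + 2**-11))
```

The dynamic range is unchanged because the exponent field is untouched; only the fine detail of each value is dropped, which is why many training workloads tolerate it.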

Based on the hardware described above, the testers ran three groups of tests on the machine, as follows:

Basic test

1. GPU-burn stability test

GPU-burn is a burn-in (stress test) tool for graphics cards. It runs CUDA computations continuously for a long time, keeping GPU utilization close to 100% throughout the test, so many professional users rely on it to test GPU stability under Linux.

Picture: GPU-burn stability test screenshot

Test Results:

The ambient temperature during the test was 24°C, and the test used two RTX A6000 graphics cards, each with a maximum power consumption of 300 W. Running at full power for 24 hours in this environment, the graphics card temperature stayed below 85°C and operation was very stable.
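A long burn-in run like this is usually watched with periodic temperature samples. The following is a minimal sketch of such a check, assuming `nvidia-smi` is on the PATH; the helper names (`parse_sample`, `over_limit`, `read_gpus`) and the sample readings are hypothetical and are not the testers' actual tooling or data.

```python
import subprocess

# nvidia-smi query returning one CSV row per GPU, without header or units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,temperature.gpu,utilization.gpu,power.draw",
    "--format=csv,noheader,nounits",
]


def parse_sample(csv_text):
    """Parse nvidia-smi CSV rows into one dict per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, temp, util, power = [field.strip() for field in line.split(",")]
        rows.append({
            "gpu": int(index),
            "temp_c": float(temp),
            "util_pct": float(util),
            "power_w": float(power),
        })
    return rows


def over_limit(rows, limit_c=85.0):
    """Return the indices of GPUs at or above the temperature limit."""
    return [row["gpu"] for row in rows if row["temp_c"] >= limit_c]


def read_gpus():
    """Take one live sample (requires an NVIDIA driver and GPU)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_sample(out.stdout)


# Hypothetical sample for two cards under load, for illustration only.
sample = "0, 78, 100, 297.41\n1, 81, 99, 299.02\n"
print(over_limit(parse_sample(sample)))
```

Calling `read_gpus()` in a loop with a sleep between samples would cover a full 24-hour run; the 85°C default simply mirrors the threshold quoted in the result.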

2. CUDA-Z computing performance

CUDA-Z is a "Z-series" utility similar to CPU-Z and GPU-Z. It shows basic information about CUDA-capable GPUs, and it can measure the computing performance of the GPU, such as integer, single-precision floating-point, and double-precision floating-point operations.

Test Results:

Judging from the test results, GPU performance rises with each step up in model. The single-precision floating-point performance of the RTX A5000 is 1.7 times that of the RTX A4000, and that of the RTX A6000 is 2.2 times that of the RTX A4000. On this comparison, the Z8 G4 with the RTX A6000 will achieve the best training and inference performance. Deep learning training mostly uses single precision or half precision rather than double precision, so the GPU's double-precision performance has little effect on deep learning efficiency.

Training test

The test uses the NVIDIA NGC TensorFlow container 21.07-tf1-py3. The main models trained are AlexNet, ResNet50, VGG, Inception_v3, and Inception_v4, all CNN models running on the TensorFlow framework. To reflect the best processing capability of the workstation, no uniform batch size was used; instead, several batch sizes were tried and the best-performing result was selected. Training was run on a single GPU and on dual GPUs, in both single precision and half precision.
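The batch-size selection described here can be sketched as a small sweep: measure throughput at each candidate size and keep the best. This is only an illustration under stated assumptions; `throughput`, `best_batch_size`, and the sweep figures are hypothetical, and `step` stands in for whatever function runs one training iteration.

```python
import time


def throughput(step, batch_size, iters=20):
    """Time `iters` calls of step(batch_size) and return images per second."""
    start = time.perf_counter()
    for _ in range(iters):
        step(batch_size)
    elapsed = time.perf_counter() - start
    return batch_size * iters / elapsed


def best_batch_size(results):
    """Return the (batch_size, images_per_sec) pair with the highest throughput."""
    return max(results.items(), key=lambda item: item[1])


# Hypothetical sweep results in images per second, for illustration only.
sweep = {32: 410.0, 64: 455.0, 128: 472.0, 256: 468.0}
print(best_batch_size(sweep))
```

Throughput often stops improving, or falls, once the batch no longer fits comfortably in GPU memory, which is why the largest size is not automatically the winner.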

This test is based on the TensorFlow framework and uses different models to compare image training throughput on workstations with different GPUs. The histogram shows that, for the same GPU model, half-precision performance is much higher than single-precision performance, and that dual-GPU performance is up to 90% higher than single-GPU performance. The HP Z8 G4 workstation with NVIDIA GPUs processes images very quickly and is well suited to training medium and lightweight models for tasks such as image recognition and classification.
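The "up to 90% higher" figure can be restated as a speedup and a scaling efficiency. A minimal sketch follows; the function name `scaling` and the throughput numbers are hypothetical and chosen only to reproduce a 90% gain.

```python
def scaling(single_gpu_rate, multi_gpu_rate, n_gpus=2):
    """Return (speedup, efficiency) for a multi-GPU run versus one GPU."""
    speedup = multi_gpu_rate / single_gpu_rate
    # Efficiency is the fraction of ideal linear scaling actually achieved.
    efficiency = speedup / n_gpus
    return speedup, efficiency


# Hypothetical rates in images per second: dual GPU is 90% faster than single.
speedup, efficiency = scaling(1000.0, 1900.0)
print(f"speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

A 90% gain on two cards corresponds to a 1.9x speedup, or 95% of ideal linear scaling.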

Inference test

In deep learning, the goal of training is to obtain a good model, and the measure of success is accuracy. Inference is different: it has none of the backpropagation and iterative weight updates of training, and simply makes predictions on new data. The AI services we use in daily life are all inference services.
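The difference between the two phases can be shown on a toy model. The sketch below fits a one-variable linear model in plain Python: `train_step` does a forward pass, computes gradients, and updates the parameters, while `predict` is the forward pass alone. The model, data, and function names are hypothetical illustrations, far smaller than the CNNs tested here.

```python
def predict(w, b, x):
    """Inference: a single forward pass, with no gradients or updates."""
    return w * x + b


def train_step(w, b, data, lr=0.05):
    """One training iteration: forward pass, MSE gradients, parameter update."""
    n = len(data)
    grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in data) / n
    grad_b = sum(2 * (predict(w, b, x) - y) for x, y in data) / n
    return w - lr * grad_w, b - lr * grad_b


# Toy data generated from y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

# Training: many iterations over the same data, adjusting w and b each time.
w, b = 0.0, 0.0
for _ in range(2000):
    w, b = train_step(w, b, data)

# Inference: one cheap forward pass on an unseen input.
print(round(predict(w, b, 4.0), 3))
```

Training repeats the expensive step thousands of times, whereas serving a prediction costs one forward pass, which is why inference workloads are judged on throughput and latency instead.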

Inference is more concerned with high throughput, low response time, low resource consumption, and a simple deployment process, and TensorRT is a deployment-level solution designed to address these challenges. Below, the deep learning frameworks supported by TensorRT are used for performance testing.

In many scenarios, inference does not need high numerical precision to achieve good results, so INT8 inference performance is added to the inference test.
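To illustrate why reduced precision can still work, here is a minimal sketch of symmetric INT8 quantization: floats are mapped to integers in [-127, 127] with a single scale factor and then mapped back. This is a simplified max-based scheme assumed for illustration; it is not TensorRT's calibration procedure, and the function names and weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map INT8 codes back to approximate float values."""
    return [q * scale for q in quantized]


# Hypothetical weights, for illustration only.
weights = [0.82, -1.27, 0.003, 0.45, -0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_err)
```

Each restored value differs from the original by at most half a quantization step, so large weights survive almost unchanged while very small ones may collapse to zero.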

The inference performance tests are all based on 1 GPU. With multiple GPUs, inference performance should increase roughly in proportion to the number of GPUs. The test results show that the lower the precision, the better the performance, and that inference performance also rises with each step up in GPU model. Paired with the different NVIDIA Ampere architecture GPUs, the Z8 G4 has very good inference processing capability and is well suited to front-end inference applications.

Across the basic tests, training tests, and inference tests, the HP Z8 G4 performed excellently and received high praise from the testers:

The first impression of HP's design is that it is tough, steady, and imposing. The front panel offers a rich set of ports, which is very helpful for peripheral expansion. In AI training and testing, these interfaces can be used to attach external resources such as cameras and external storage. The chassis is also tool-free, which makes it convenient for staff to install and replace parts, and its thick material helps protect the core components inside.

With the HP Z Cooler silent cooling solution, the air duct design across the several zones inside the chassis is well thought out. During training, the heat generated by the GPU is exhausted from the rear through the air duct without recirculating inside the chassis, which is very good for the stability of the workstation.

With up to 3 TB of memory, a larger batch size can be used during training and inference to improve AI processing efficiency.

Hard disk storage can be expanded flexibly and supports 40 TB of local storage. Training usually draws on a large amount of data, and this large capacity is very helpful for accessing massive data sets.

The tests show that HP's Z8 G4 workstation with Ampere-based NVIDIA RTX A4000, A5000, and A6000 GPUs can meet a variety of training and inference needs and has a very wide range of application scenarios, including the following industries:

Manufacturing: industrial product defect detection, product attribute recognition, autonomous driving, new material research and development, etc.;

product testing

Autonomous driving

Entertainment industry: intelligent virtual digital humans, text translation, video/image editing, AI stylization, game AI development, etc.;

AI frame interpolation

AI stylization

Medical industry: medical research and development, intelligent recognition of medical images, etc.;

Intelligent image recognition

Security industry: face recognition, fingerprint recognition, behavior recognition, speech recognition, etc.;

Face recognition

Behavior recognition

The HP Z8 G4 workstation has been used in the fields above, where it has significantly improved work efficiency, and it meets the high performance and long-term stability that AI training and inference require.

As a well-known brand in the industry, HP Workstation holds many ISV certifications and has a sound quality assurance and after-sales service system. It offers users 7×24-hour technical support and is a close partner for your work.

The HP Z8 G4 workstation is extraordinary.
