-
PERFORMANCE
MATTERS
(joint work with Charlie Curtsinger, Grinnell College)
emeryberger.com, @emeryberger
Emery Berger
College of Information
and Computer Sciences
UMASS AMHERST
A short time ago
:
un.bmp
Ogle is too slow!
OGLE’84 is too slow!
Performance used to be easy
[Figure: Transistors (millions) and Clock Speed (MHz) by year, 1970 to 1995; logarithmic axis from 0.001 to 10,000]
gle
loading…
No mojitos for me…
Back to the present…
Performance not easy anymore
[Figure: Transistors (millions) and Clock Speed (MHz) by year, from 1970; logarithmic axis from 0.001 to 10,000]
0 points | 197 pages | 11.90 MB | 5 months ago
-
Being Friendly to Your
Hardware
Performance Engineering
A gentle introduction to hardware for software engineers
Where does C++ run?
On an abstract C++ machine
On an abstract C++ machine?
In most practical cases at boot time only
Same capacity, different composition => different performance profile
From JESD 79-4 DDR4 specification
Memory
• Memory system is in the uncore
• Cores act …
… Multiple instructions resulting in fewer operations
• ISA restrictions may have an impact on performance
Imaginary ARM
mov r20, 0x123456789abcdef0
Register renaming
Branching
Fetch
Decode
Queue
0 points | 111 pages | 2.23 MB | 5 months ago
-
Poster submission: Modern C++ for Parallelism in High Performance Computing
Victor Eijkhout
CppCon 2024
Introduction
This poster reports on ‘D2D’, a benchmark that explores elegance of expression and context of a High Performance Computing ‘mini-application’. The same code has
been implemented using a number of different approaches to parallelism. Implementations are
discussed with performance results.
Relevance
… multi-dimensional arrays through ‘mdspan’, it is interesting to explore what C++ can offer for lower-level, performance-critical operations.
Scientific computing is an interesting test case since many algorithms are
0 points | 3 pages | 91.16 KB | 5 months ago
-
Introduction
First steps
Context
Theoretical foundations
Outline of an implementation
Conclusion
High-Performance Numerical Integration in the Age of C++26
Vincent Reverdy
Laboratoire d’Annecy de Physique des
… past, other languages do far better in terms of everything: functionality, ease of use, and even performance
This talk
The goal is NOT to revolutionize everything or show a library that beats everything
…
$$y_{n+1} = y_n + h \sum_{i=1}^{s} b_i k_i$$
$$k_i = f\left(t_n + c_i h,\; y_n + h \sum_{j=1}^{s} a_{ij} k_j\right), \qquad i = 1, \ldots, s$$
Performance concerns
The Butcher Tableau can be very sparse
Null coefficients should be optimized away
Compilers
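One possible way to let the compiler drop null Butcher coefficients is sketched below. It is not taken from the talk: the names `Rk4`, `add_term`, `stage_sum`, `rk_step`, and `example_decay_step` are hypothetical, the tableau is the classical fourth-order Runge-Kutta method, and the sketch assumes an explicit method (a_ij = 0 for j >= i), so the inner sum runs over j < i only. It needs C++17 for `if constexpr` and fold expressions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical compile-time Butcher tableau: classical RK4 (assumption, not from the talk).
struct Rk4 {
    static constexpr std::size_t s = 4;
    static constexpr double a[4][4] = {{0.0, 0.0, 0.0, 0.0},
                                       {0.5, 0.0, 0.0, 0.0},
                                       {0.0, 0.5, 0.0, 0.0},
                                       {0.0, 0.0, 1.0, 0.0}};
    static constexpr double b[4] = {1.0 / 6, 1.0 / 3, 1.0 / 3, 1.0 / 6};
    static constexpr double c[4] = {0.0, 0.5, 0.5, 1.0};
};

// Adds a_ij * k_j to sum; a null a_ij instantiates to an empty function body.
template <class T, std::size_t I, std::size_t J>
void add_term(double& sum, const std::array<double, T::s>& k) {
    if constexpr (T::a[I][J] != 0.0) sum += T::a[I][J] * k[J];
}

// Sum over j < i of a_ij * k_j (explicit methods only).
template <class T, std::size_t I, std::size_t... J>
double stage_sum(const std::array<double, T::s>& k, std::index_sequence<J...>) {
    double sum = 0.0;
    (add_term<T, I, J>(sum, k), ...);
    return sum;
}

template <class T, class F, std::size_t... I>
double rk_step_impl(F&& f, double t, double y, double h, std::index_sequence<I...>) {
    std::array<double, T::s> k{};
    // The comma fold evaluates stages left to right, so k_j is ready before k_i for j < i.
    ((k[I] = f(t + T::c[I] * h,
               y + h * stage_sum<T, I>(k, std::make_index_sequence<I>{}))), ...);
    double acc = 0.0;
    ((acc += T::b[I] * k[I]), ...);
    return y + h * acc;  // y_{n+1} = y_n + h * sum_i b_i k_i
}

// One step of the scalar ODE y' = f(t, y) using tableau T.
template <class T, class F>
double rk_step(F&& f, double t, double y, double h) {
    return rk_step_impl<T>(f, t, y, h, std::make_index_sequence<T::s>{});
}

// Usage sketch: one step of dy/dt = -y from y(0) = 1 with h = 0.1.
inline double example_decay_step() {
    const double y1 = rk_step<Rk4>([](double, double y) { return -y; }, 0.0, 1.0, 0.1);
    assert(y1 > 0.0 && y1 < 1.0);  // the solution decays but stays positive
    return y1;
}
```

Because the tableau entries are compile-time constants, each null a_ij is discarded when the template is instantiated rather than multiplied at run time. Whether this helps over what an optimizer already does with a runtime table is something to measure, not assume.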
0 points | 57 pages | 4.14 MB | 5 months ago
-
Nim - the first high performance language with full support for hot code-reloading at runtime by Viktor