forked from open-mpi/hwloc
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathNEWS
2173 lines (2008 loc) · 103 KB
/
NEWS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
Copyright © 2009 CNRS
Copyright © 2009-2024 Inria. All rights reserved.
Copyright © 2009-2013 Université Bordeaux
Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved.
Copyright © 2020 Hewlett Packard Enterprise. All rights reserved.
$COPYRIGHT$
Additional copyrights may follow
$HEADER$
===========================================================================
This file contains the main features as well as overviews of specific
bug fixes (and other actions) for each version of hwloc since version
0.9.
Version 2.11.1
--------------
* Fix bash completions, thanks Tavis Rudd.
Version 2.11.0
--------------
* API
+ Add HWLOC_MEMBIND_WEIGHTED_INTERLEAVE memory binding policy on
Linux 6.9+. Thanks to Honggyu Kim for the patch.
- weighted_interleave_membind is added to membind support bits.
- The "weighted" policy is added to the hwloc-bind tool.
+ Add hwloc_obj_set_subtype(). Thanks to Hadrien Grasland for the report.
* GPU support
+ Don't hide the GPU NUMA node on NVIDIA Grace Hopper.
+ Get Intel GPU OpenCL device locality.
+ Add bandwidths between subdevices in the LevelZero XeLinkBandwidth
matrix.
+ Fix PCI Gen4+ link speed of NVIDIA GPU obtained from NVML,
thanks to Akram Sbaih for the report.
* Windows support
+ Fix Windows support when UNICODE is enabled, several hwloc features
were missing, thanks to Martin for the report.
+ Fix the enabling of CUDA in Windows CMake build,
Thanks to Moritz Kreutzer for the patch.
+ Fix CUDA/OpenCL test source path in Windows CMake.
* Tools
+ Option --best-memattr may now return multiple nodes. Additional
configuration flags may be given to tweak its behavior.
+ hwloc-info has a new --get-attr option to get a single attribute.
+ hwloc-info now supports "levels", "support" and "topology"
special keywords for backward compatibility for hwloc 3.0.
+ The --taskset command-line option is superseded by the new
--cpuset-output-format which also allows to export as list.
+ hwloc-calc may now import bitmasks described as a list of bits
with the new "--cpuset-input-format list".
* Misc
+ The MemoryTiersNr info attribute in the root object now says how many
memory tiers were built. Thanks to Antoine Morvan for the report.
+ Fix the management of infinite cpusets in the bitmap printf/sscanf
API as well as in command-line tools.
+ Add section "Compiling software on top of hwloc's C API" in the
documentation with examples for GNU Make and CMake,
thanks to Florent Pruvost for the help.
Version 2.10.0
--------------
* Heterogeneous Memory core improvements
+ Better heuristics to identify the subtype of memory such as HBM,
DRAM, NVM, CXL-DRAM, etc.
+ Build memory tiers, i.e. sets of NUMA nodes with the same subtype
and similar performance.
- NUMA node tier ranks are exposed in the new MemoryTier info
attribute (starts from 0 for highest bandwidth tier)..
+ See the new Heterogeneous Memory section in the documentation.
* API
+ Add hwloc_topology_free_group_object() to discard a Group created
by hwloc_topology_alloc_group_object().
* Linux backend
+ Fix cpukinds on NVIDIA Grace to report identical cores even if they
actually have very small frequency differences.
Thanks to John C. Linford for the report.
+ Add CXLDevice attributes to CXL DAX objects and NUMA nodes to show
which PCI device implements which window.
+ Ignore buggy memory-side caches and memory attributes when fake NUMA
emulation is enabled on the Linux kernel command-line.
+ Add more info attributes in MemoryModule Misc objects,
thanks to Zubiao Xiong for the patch.
+ Get CPUModel and CPUFamily info attributes on LoongArch platforms.
* x86 backend
+ Add support for new AMD CPUID leaf 0x80000026 for better detection
of Core Complex and Die on Zen4 processors.
+ Improve Zhaoxin CPU topology detection.
* Tools
+ Input locations and many command-line options (e.g. hwloc-calc -I -N -H,
lstopo --only) now accept filters such as "NUMA[HBM]" so that only
objects are that type and subtype are considered.
- NUMA[tier=1] is also accepted for selecting NUMA nodes depending
on their MemoryTier info attribute.
+ Add --object-output to hwloc-calc to report the type as a prefix to
object indexes, e.g. Core:2 instead of 2 in the output of -I.
+ hwloc-info --ancestor and --descendants now accepts kinds of objects
instead of single types.
- The new --first option only shows the first matching object.
+ Add --children-of-pid to hwloc-ps to show a hierarchy of processes.
Thanks to Antoine Morvan for the suggestion.
+ Add --misc-from to lstopo to add Misc objects described in a file.
- To be combined with the new hwloc-ps --lstopo-misc for a customizable
lstopo --top replacement.
* Misc
+ lstopo may now configure the layout of memory object placed above,
for instance with --children-order memory:above:vert.
+ Fix XML import from memory or stdin when using libxml2 2.12.
+ Fix installation failures when configuring with --target,
thanks to Clement Foyer for the patch.
+ Fix support for 128bit pointer architectures.
+ Remove Netloc.
Version 2.9.3
-------------
* Handle Linux glibc allocation errors in binding routines (CVE-2022-47022).
* Fix hwloc-calc when searching objects on heterogeneous memory platforms,
thanks to Antoine Morvan for the report.
* Fix hwloc_get_next_child() when there are some memory-side caches.
* Don't crash if the topology is empty because Linux cgroups are wrong.
* Improve some hwloc-bind warnings in case of command-line parsing errors.
* Many documentation improvements all over the place, including:
+ hwloc_topology_restrict() and hwloc_topology_insert_group() may reorder
children, causing the logical indexes of objects to change.
Version 2.9.2
-------------
* Don't forget L3i when defining filters for multiple levels of caches
with hwloc_topology_set_cache/icache_types_filter().
* Fix object total_memory after hwloc_topology_insert_group_object().
* Fix the (non-yet) exporting in synthetic description for complex memory
hierarchies with memory-side caches, etc.
* Fix some default size attributes when building synthetic topologies.
* Fix size units in hwloc-annotate.
* Improve bitmap reallocation error management in many functions.
* Documentation improvements:
+ Better document return values of functions.
+ Add "Error reporting" section (in hwloc.h and in the doxygen doc).
+ Add FAQ entry "What may I disable to make hwloc faster?"
+ Improve FAQ entries "Why is lstopo slow?" and
"I only need ..., why should I use hwloc?"
+ Clarify how to deal with cpukinds in hwloc-calc and hwloc-bind
manpages.
Version 2.9.1
-------------
* Don't forget to apply object type filters to "perflevel" caches detected
on recent Mac OS X releases, thanks to Michel Lesoinne for the report.
* Fix a failed assertion in hwloc_topology_restrict() when some NUMA nodes
are removed because of HWLOC_RESTRICT_FLAG_REMOVE_CPULESS but no PUs are.
Thanks to Mark Grondona for reporting the issue.
* Mark HPE Cray Slingshot NICs with subtype "Slingshot".
Version 2.9.0
-------------
* Backends
+ Expose the memory size of CXL memory devices (Type 3) on Linux.
+ The LevelZero backend now reports the "XeLinkBandwidth" distance
matrix between L0 devices (and subdevices) when available.
+ Add support for CUDA compute capability up to 9.0.
* Tools
+ lstopo now switches to console mode when its output is redirected.
Graphical window mode may be forced back with --of window.
+ hwloc-calc now accepts "numa" in -H, and I/O subtypes such as "gpu"
in -I and -N.
Version 2.8.0
-------------
* API
+ Add HWLOC_TOPOLOGY_FLAG_NO_DISTANCES, _NO_MEMATTRS and _NO_CPUKINDS
to reduce the overhead when unneeded.
+ Add separate Read/Write Bandwidth/Latency memory attributes and
implement them on Linux.
* Backends
+ NUMA nodes may now have a subtype such as DRAM, HBM, SPM, or NVM
on heterogeneous memory platforms on Linux.
- Add DAXType and DAXParent attributes on Linux to tell where a
DAX device or its corresponding NUMA node come from (SPM for
Specific-Purpose or NVM for Non-Volatile Memory).
+ Detect heterogeneous caches in hybrid CPUs on MacOS X,
thanks to Paul Bone for the help.
+ Max frequencies are not ignored in Linux cpukinds anymore (they were
ignored in hwloc 2.7.0), but they may be slightly adjusted to avoid
reporting hybrid CPUs because Intel Turbo Boost Max 3.0.
- See the documentation of environment variable HWLOC_CPUKINDS_MAXFREQ.
+ Hardwire the PCI locality of HPE Cray EX235a nodes.
* Tools
+ lstopo and other tools may now load Linux and x86 cpuid topology files
from a tarball.
+ lstopo may now replace the P# and L# index prefixes with custom strings
thanks to --os-index-prefix and --logical-index-prefix options.
* Misc
+ Add --disable-readme to avoid regenerating the top-level hwloc README
file from the documentation.
Version 2.7.2
-------------
* Fix a crash when LevelZero devices have multiple subdevices,
e.g. on PonteVecchio GPUs, thanks to Jonathan Peyton.
* Fix a leak when importing cpukinds from XML,
thanks to Hui Zhou.
Version 2.7.1
-------------
* Workaround crashes when virtual machines report incoherent x86 CPUID
information about numbers of cores and threads.
Thanks to Peter Bense for the report.
* Use setenv() instead of putenv() when trying to force enable oneAPI L0
support, to avoid issues with applications that touch the environment,
thanks to Josh Hursey for the patch.
* Add some warnings at the end of configure when GPU libraries are
missing on the system or their path is missing in the environment.
Version 2.7.0
-------------
* Backends
+ Add support for NUMA nodes and caches with more than 64 PUs across
multiple processor groups on Windows 11 and Windows Server 2022.
+ Group objects are not created for Windows processor groups anymore,
except if HWLOC_WINDOWS_PROCESSOR_GROUP_OBJS=1 in the environment.
+ Expose "Cluster" group objects on Linux kernel 5.16+ for CPUs
that share some internal cache or bus. This can be equivalent
to the L2 Cache level on some platforms (e.g. x86) or a specific
level between L2 and L3 on others (e.g. ARM Kungpeng 920).
Thanks to Jonathan Cameron for the help.
- HWLOC_DONT_MERGE_CLUSTER_GROUPS=1 may be set in the environment
to prevent these groups from being merged with identical caches, etc.
+ Improve the oneAPI LevelZero backend:
- Expose subdevices such as "ze0.1" inside root OS devices ("ze0")
when the hardware contains multiple subdevices.
- Add many new attributes to describe device type, and the
numbers of slices, subslices, execution units and threads.
- Expose the memory information as LevelZeroHBM/DDR/MemorySize infos.
+ Ignore the max frequencies of cores in Linux cpukinds when the
base frequencies are available (to avoid exposing hybrid CPUs
when Intel Turbo Boost Max 3.0 gives slightly different max
frequencies to CPU cores).
- May be reverted by setting HWLOC_CPUKINDS_MAXFREQ=1 in the environment.
* Tools
+ Add --grey and --palette options to switch lstopo to greyscale or
white-background-only graphics, or to tune individual colors.
* Build
+ Windows CMake builds now support non-MSVC compilers, detect several
features at build time, can build/run tests, etc.
Thanks to Michael Hirsch and Alexander Neumann .
Version 2.6.0
-------------
* Backends
+ Expose two cpukinds for energy-efficient cores (icestorm) and
high-performance cores (firestorm) on Apple M1 on Mac OS X.
+ Use sysfs CPU "capacity" to rank hybrid cores by efficiency
on Linux when available (mostly on recent ARM platforms for now).
+ Improve HWLOC_MEMBIND_BIND (without the STRICT flag) on Linux kernel
>= 5.15: If more than one node is given, the kernel may now use all
of them instead of only the first one before falling back to others.
+ Expose cache os_index when available on Linux, it may be needed
when using resctrl to configure cache partitioning, memory bandwidth
monitoring, etc.
+ Add a "XGMIHops" distances matrix in the RSMI backend for AMD GPU
interconnected through XGMI links.
+ Expose AMD GPU memory information (VRAM and GTT) in the RSMI backend.
+ Add OS devices such as "bxi0" for Atos/Bull BXI HCAs on Linux.
* Tools
+ lstopo has a better placement algorithm with respect to I/O
objects, see --children-order in the manpage for details.
+ hwloc-annotate may now change object subtypes and cache or memory
sizes.
* Build
+ Allow to specify the ROCm installation for building the RSMI backend:
- Use a custom installation path if specified with --with-rocm=<dir>.
- Use /opt/rocm-<version> if specified with --with-rocm-version=<version>
or the ROCM_VERSION environment variable.
- Try /opt/rocm if it exists.
- See "How do I enable ROCm SMI and select which version to use?"
in the FAQ for details.
+ Add a CMakeLists for Windows under contrib/windows-cmake/ .
* Documentation
+ Add FAQ entry "How do I create a custom heterogeneous and
asymmetric topology?"
Version 2.5.0
-------------
* API
+ Add hwloc/windows.h to query Windows processor groups.
+ Add hwloc_get_obj_with_same_locality() to convert between objects
with same locality, for instance NUMA nodes and Packages,
or OS devices within a PCI device.
+ Add hwloc_distances_transform() to modify distances structures.
- hwloc-annotate and lstopo have new distances-transform options.
+ hwloc_distances_add() is replaced with _add_create() followed by
_add_values() and _add_commit(). See hwloc/distances.h for details.
+ Add topology flags to mitigate binding modifications during
hwloc discovery, especially on Windows:
- HWLOC_TOPOLOGY_FLAG_RESTRICT_TO_CPUBINDING and _MEMBINDING
restrict discovery to PUs and NUMA nodes inside the binding.
- HWLOC_TOPOLOGY_FLAG_DONT_CHANGE_BINDING prevents from ever
changing the binding during discovery.
* Backends
+ Add a levelzero backend for oneAPI L0 devices, exposed as OS devices
of subtype "LevelZero" and name such as "ze0".
- Add hwloc/levelzero.h for interoperability between converting
between L0 API devices and hwloc cpusets or OS devices.
+ Expose NEC Vector Engine cards on Linux as OS devices of subtype
"VectorEngine" and name "ve0", etc.
Thanks to Anara Kozhokanova, Tim Cramer and Erich Focht for the help.
+ Add a NVLinkBandwidth distances structure between NVIDIA GPUs
(and POWER processor or NVSwitches) in the NVML backend,
and a XGMIBandwidth distances structure between AMD GPUs
in the RSMI backends.
- See "Topology Attributes: Distances, Memory Attributes and CPU Kinds"
in the documentation for details about these new distances.
+ Add support for NUMA node 0 being offline in Linux, thanks to Jirka Hladky.
* Build
+ Add --with-cuda-version=<version> or look at the CUDA_VERSION
environment variable to find the appropriate CUDA pkg-config files.
Thanks to Stephen Herbein for the suggestion.
- Also add --with-cuda=<dir> to specify the CUDA installation path
manually (and its NVML and OpenCL components).
Thanks to Andrea Bocci for the suggestion.
- See "How do I enable CUDA and select which CUDA version to use?"
in the FAQ for details.
* Tools
+ lstopo now has a --windows-processor-groups option on Windows.
+ hwloc-ps now has a --short-name option to avoid long/truncated
command path.
+ hwloc-ps now has a --single-ancestor option to return a single
(possibly too large) object where a process is bound.
+ hwloc-ps --pid-cmd may now query environment variables,
including MPI-specific variables to find out process ranks.
Version 2.4.1
-------------
* Fix AMD OpenCL device locality when PCI bus or device number >= 128.
Thanks to Edgar Leon for reporting the issue.
+ Applications using any of the following inline functions must
be recompiled to get the fix: hwloc_opencl_get_device_pci_busid()
hwloc_opencl_get_device_cpuset(), hwloc_opencl_get_device_osdev().
* Fix the ranking of cpukinds on non-Windows systems,
thanks to Ivan Kochin for the report.
* Fix the insertion of custom Groups after loading the topology,
thanks to Scott Hicks.
* Add support for CPU0 being offline in Linux, thanks to Garrett Clay.
* Fix missing x86 Package and Core objects FreeBSD/NetBSD.
Thanks to Thibault Payet and Yuri Victorovich for the report.
* Fix the import of very large distances with heterogeneous object types.
* Fix a memory leak in the Linux backend,
thanks to Perceval Anichini.
Version 2.4.0
-------------
* API
+ Add hwloc/cpukinds.h for reporting information about hybrid CPUs.
- Use Linux cpufreq frequencies to rank cores by efficiency.
- Use x86 CPUID hybrid leaf and future Linux kernels sysfs CPU type
files to identify Intel Atom and Core cores.
- Use the Windows native EfficiencyClass to separate kinds.
* Backends
+ Properly handle Linux kernel 5.10+ exposing ACPI HMAT information
with knowledge of Generic Initiators.
* Tools
+ lstopo has new --cpukinds and --no-cpukinds options for showing
CPU kinds or not in textual and graphical modes respectively.
+ hwloc-calc has a new --cpukind option for filtering PUs by kind.
+ hwloc-annotate has a new cpukind command for modifying CPU kinds.
* Misc
+ Fix hwloc_bitmap_nr_ulongs(), thanks to Norbert Eicker.
+ Add a documentation section about
"Topology Attributes: Distances, Memory Attributes and CPU Kinds".
+ Silence some spurious warnings in the OpenCL backend and when showing
process binding with lstopo --ps.
Version 2.3.0
-------------
* API
+ Add hwloc/memattrs.h for exposing latency/bandwidth information
between initiators (CPU sets for now) and target NUMA nodes,
typically on heterogeneous platforms.
- When available, bandwidths and latencies are read from the ACPI HMAT
table exposed by Linux kernel 5.2+.
- Attributes may also be customized to expose user-defined performance
information.
+ Add hwloc_get_local_numanode_objs() for listing NUMA nodes that are
local to some locality.
+ The new topology flag HWLOC_TOPOLOGY_FLAG_IMPORT_SUPPORT causes
support arrays to be loaded from XML exported with hwloc 2.3+.
- hwloc_topology_get_support() now returns an additional "misc"
array with feature "imported_support" set when support was imported.
+ Add hwloc_topology_refresh() to refresh internal caches after modifying
the topology and before consulting the topology in a multithread context.
* Backends
+ Add a ROCm SMI backend and a hwloc/rsmi.h helper file for getting
the locality of AMD GPUs, now exposed as "rsmi" OS devices.
Thanks to Mike Li.
+ Remove POWER device-tree-based topology on Linux,
(it was disabled by default since 2.1).
* Tools
+ Command-line options for specifying flags now understand comma-separated
lists of flag names (substrings).
+ hwloc-info and hwloc-calc have new --local-memory --local-memory-flags
and --best-memattr options for reporting local memory nodes and filtering
by memory attributes.
+ hwloc-bind has a new --best-memattr option for filtering by memory attributes
among the memory binding set.
+ Tools that have a --restrict option may now receive a nodeset or
some custom flags for restricting the topology.
+ lstopo now has a --thickness option for changing line thickness in the
graphical output.
+ Fix lstopo drawing when autoresizing on Windows 10.
+ Pressing the F5 key in lstopo X11 and Windows graphical/interactive outputs
now refreshes the display according to the current topology and binding.
+ Add a tikz lstopo graphical backend to generate picture easily included into
LaTeX documents. Thanks to Clement Foyer.
* Misc
+ The default installation path of the Bash completion file has changed to
${datadir}/bash-completion/completions/hwloc. Thanks to Tomasz Kłoczko.
Version 2.2.0
-------------
* API
+ Add hwloc_bitmap_singlify_by_core() to remove SMT from a given cpuset,
thanks to Florian Reynier for the suggestion.
+ Add --enable-32bits-pci-domain to stop ignoring PCI devices with domain
>16bits (e.g. 10000:02:03.4). Enabling this option breaks the library ABI.
Thanks to Dylan Simon for the help.
* Backends
+ Add support for Linux cgroups v2.
+ Add NUMA support for FreeBSD.
+ Add get_last_cpu_location support for FreeBSD.
+ Remove support for Intel Xeon Phi (MIC, Knights Corner) co-processors.
* Tools
+ Add --uid to filter the hwloc-ps output by uid on Linux.
+ Add a GRAPHICAL OUTPUT section in the manpage of lstopo.
* Misc
+ Use the native dlopen instead of libltdl,
unless --disable-plugin-dlopen is passed at configure time.
Version 2.1.0
-------------
* API
+ Add a new "Die" object (HWLOC_OBJ_DIE) for upcoming x86 processors
with multiple dies per package, in the x86 and Linux backends.
+ Add the new HWLOC_OBJ_MEMCACHE object type for memory-side caches.
- They are filtered-out by default, except in command-line tools.
- They are only available on very recent platforms running Linux 5.2+
and uptodate ACPI tables.
- The KNL MCDRAM in cache mode is still exposed as a L3 unless
HWLOC_KNL_MSCACHE_L3=0 in the environment.
+ Add HWLOC_RESTRICT_FLAG_BYNODESET and _REMOVE_MEMLESS for restricting
topologies based on some memory nodes.
+ Add hwloc_topology_set_components() for blacklisting some components
from being enabled in a topology.
+ Add hwloc_bitmap_nr_ulongs() and hwloc_bitmap_from/to_ulongs(),
thanks to Junchao Zhang for the suggestion.
+ Improve the API for dealing with disallowed resources
- HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM is replaced with FLAG_INCLUDE_DISALLOWED
and --whole-system command-line options with --disallowed.
. Former names are still accepted for backward compatibility.
- Add hwloc_topology_allow() for changing allowed sets after load().
- Add the HWLOC_ALLOW=all environment variable to totally ignore
administrative restrictions such as Linux Cgroups.
- Add disallowed_pu and disallowed_numa bits to the discovery support
structure.
+ Group objects have a new "dont_merge" attribute to prevent them from
being automatically merged with identical parent or children.
+ Add more distances-related features:
- Add hwloc_distances_get_name() to retrieve a string describing
what a distances structure contain.
- Add hwloc_distances_get_by_name() to retrieve distances structures
based on their name.
- Add hwloc_distances_release_remove()
- Distances may now cover objects of different types with new kind
HWLOC_DISTANCES_KIND_HETEROGENEOUS_TYPES.
* Backends
+ Add support for Linux 5.3 new sysfs cpu topology files with Die information.
+ Add support for Intel v2 Extended Topology Enumeration in the x86 backend.
+ Improve memory locality on Linux by using HMAT initiators (exposed
since Linux 5.2+), and NUMA distances for CPU-less NUMA nodes.
+ The x86 backend now properly handles offline CPUs.
+ Detect the locality of NVIDIA GPU OpenCL devices.
+ Ignore NUMA nodes that correspond to NVIDIA GPU by default.
- They may be unignored if HWLOC_KEEP_NVIDIA_GPU_NUMA_NODES=1 in the environment.
- Fix their CPU locality and add info attributes to identify them.
Thanks to Max Katz and Edgar Leon for the help.
+ Add support for IBM S/390 drawers.
+ Rework the heuristics for discovering KNL Cluster and Memory modes
to stop assuming all CPUs are online (required for mOS support).
Thanks to Sharath K Bhat for testing patches.
+ Ignore NUMA node information from AMD topoext in the x86 backend,
unless HWLOC_X86_TOPOEXT_NUMANODES=1 is set in the environment.
+ Expose Linux DAX devices as hwloc Block OS devices.
+ Remove support for /proc/cpuinfo-only topology discovery in Linux
kernel prior to 2.6.16.
+ Disable POWER device-tree-based topology on Linux by default.
- It may be reenabled by setting HWLOC_USE_DT=1 in the environment.
+ Discovery components are now divided in phases that may be individually
blacklisted.
- The linuxio component has been merged back into the linux component.
* Tools
+ lstopo
- lstopo factorizes objects by default in the graphical output when
there are more than 4 identical children.
. New options --no-factorize and --factorize may be used to configure this.
. Hit the 'f' key to disable factorizing in interactive outputs.
- Both logical and OS/physical indexes are now displayed by default
for PU and NUMA nodes.
- The X11 and Windows interactive outputs support many keyboard
shortcuts to dynamically customize the attributes, legend, etc.
- Add --linespacing and change default margins and linespacing.
- Add --allow for changing allowed sets.
- Add a native SVG backend. Its graphical output may be slightly less
pretty than Cairo (still used by default if available) but the SVG
code provides attributes to manipulate objects from HTML/JS.
See dynamic_SVG_example.html for an example.
+ Add --nodeset options to hwloc-calc for converting between cpusets and
nodesets.
+ Add --no-smt to lstopo, hwloc-bind and hwloc-calc to ignore multiple
PU in SMT cores.
+ hwloc-annotate may annotate multiple locations at once.
+ Add a HTML/JS version of hwloc-ps. See contrib/hwloc-ps.www/README.
+ Add bash completions.
* Misc
+ Add several FAQ entries in "Compatibility between hwloc versions"
about API version, ABI, XML, Synthetic strings, and shmem topologies.
Version 2.0.4 (also included in 1.11.13 when appropriate)
-------------
* Add support for Linux 5.3 new sysfs cpu topology files with Die information.
* Add support for Intel v2 Extended Topology Enumeration in the x86 backend.
* Tiles, Modules and Dies are exposed as Groups for now.
+ HWLOC_DONT_MERGE_DIE_GROUPS=1 may be set in the environment to prevent
Die groups from being automatically merged with identical parent or children.
* Ignore NUMA node information from AMD topoext in the x86 backend,
unless HWLOC_X86_TOPOEXT_NUMANODES=1 is set in the environment.
* Group objects have a new "dont_merge" attribute to prevent them from
being automatically merged with identical parent or children.
Version 2.0.3 (also included in 1.11.12 when appropriate)
-------------
* Fix build on Cygwin, thanks to Marco Atzeri for the patches.
* Fix a corner case of hwloc_topology_restrict() where children would
become out-of-order.
* Fix the return length of export_xmlbuffer() functions to always
include the ending \0.
* Fix lstopo --children-order argument parsing.
Version 2.0.2 (also included in 1.11.11 when appropriate)
-------------
* Add support for Hygon Dhyana processors in the x86 backend,
thanks to Pu Wen for the patch.
* Fix symbol renaming to also rename internal components,
thanks to Evan Ramos for the patch.
* Fix build on HP-UX, thanks to Richard Lloyd for reporting the issues.
* Detect PCI link speed without being root on Linux >= 4.13.
* Add HWLOC_VERSION* macros to the public headers,
thanks to Gilles Gouaillardet for the suggestion.
Version 2.0.1 (also included in 1.11.10 when relevant)
-------------
* Bump the library soname to 15:0:0 to avoid conflicts with hwloc 1.11.x
releases. The hwloc 2.0.0 soname was buggy (12:0:0), applications will
have to be recompiled.
* Serialize pciaccess discovery to fix concurrent topology loads in
multiple threads.
* Fix hwloc-dump-hwdata to only process SMBIOS information that correspond
to the KNL and KNM configuration.
* Add a heuristic for guessing KNL/KNM memory and cluster modes when
hwloc-dump-hwdata could not run as root earlier.
* Add --no-text lstopo option to remove text from some boxes in the
graphical output. Mostly useful for removing Group labels.
* Some minor fixes to memory binding.
Version 2.0.0
-------------
*** The ABI of the library has changed. ***
For instance some hwloc_obj fields were reordered, added or removed, see below.
+ HWLOC_API_VERSION and hwloc_get_api_version() now give 0x00020000.
+ See "How do I handle ABI breaks and API upgrades ?" in the FAQ
and "Upgrading to hwloc 2.0 API" in the documentation.
* Major API changes
+ Memory, I/O and Misc objects are now stored in dedicated children lists,
not in the usual children list that is now only used for CPU-side objects.
- hwloc_get_next_child() may still be used to iterate over these 4 lists
of children at once.
- hwloc_obj_type_is_normal(), _memory() and _io() may be used to check
the kind of a given object type.
+ Topologies always have at least one NUMA object. On non-NUMA machines,
a single NUMA object is added to describe the entire machine memory.
The NUMA level cannot be ignored anymore.
+ The NUMA level is special since NUMA nodes are not in the main hierarchy
of objects anymore. Its depth is a fake negative depth that should not be
compared with normal levels.
- If all memory objects are attached to parents at the same depth,
it may be retrieved with hwloc_get_memory_parents_depth().
+ The HWLOC_OBJ_CACHE type is replaced with 8 types HWLOC_OBJ_L[1-5]CACHE
and HWLOC_OBJ_L[1-3]ICACHE that remove the need to disambiguate levels
when looking for caches with _by_type() functions.
- New hwloc_obj_type_is_{,d,i}cache() functions may be used to check whether
a given type is a cache.
+ Reworked ignoring/filtering API
- Replace hwloc_topology_ignore*() functions with hwloc_topology_set_type_filter()
and hwloc_topology_set_all_types_filter().
. Contrary to hwloc_topology_ignore_{type,all}_keep_structure() which
removed individual objects, HWLOC_TYPE_FILTER_KEEP_STRUCTURE only removes
entire levels (so that topology do not become too asymmetric).
- Remove HWLOC_TOPOLOGY_FLAG_ICACHES in favor of hwloc_topology_set_icache_types_filter()
with HWLOC_TYPE_FILTER_KEEP_ALL.
- Remove HWLOC_TOPOLOGY_FLAG_IO_DEVICES, _IO_BRIDGES and _WHOLE_IO in favor of
hwloc_topology_set_io_types_filter() with HWLOC_TYPE_FILTER_KEEP_ALL or
HWLOC_TYPE_FILTER_KEEP_IMPORTANT.
+ The distance API has been completely reworked. It is now described
in hwloc/distances.h.
+ Return values
- Most functions in hwloc/bitmap.h now return an int that may be negative
in case of failure to realloc/extend the internal storage of a bitmap.
- hwloc_obj_add_info() also returns an int in case allocations fail.
* Minor API changes
+ Object attributes
- obj->memory is removed.
. local_memory and page_types attributes are now in obj->attr->numanode
. total_memory moves obj->total_memory.
- Objects do not have allowed_cpuset and allowed_nodeset anymore.
They are only available for the entire topology using
hwloc_topology_get_allowed_cpuset() and hwloc_topology_get_allowed_nodeset().
- Objects now have a "subtype" field that supersedes former "Type" and
"CoProcType" info attributes.
+ Object and level depths are now signed ints.
+ Object string printing and parsing
- hwloc_type_sscanf() deprecates the old hwloc_obj_type_sscanf().
- hwloc_type_sscanf_as_depth() is added to convert a type name into
a level depth.
- hwloc_obj_cpuset_snprintf() is deprecated in favor of hwloc_bitmap_snprintf().
+ Misc objects
- Replace hwloc_topology_insert_misc_object_by_cpuset() with
hwloc_topology_insert_group_object() to precisely specify the location
of an additional hierarchy level in the topology.
- Misc objects have their own level and depth to iterate over all of them.
- Misc objects may now only be inserted as a leaf object with
hwloc_topology_insert_misc_object() which deprecates
hwloc_topology_insert_misc_object_by_parent().
+ hwloc_topology_restrict() doesn't remove objects that contain memory
by default anymore.
- The list of existing restrict flags was modified.
+ The discovery support array now contains some NUMA specific bits.
+ XML export functions take an additional flags argument,
for instance for exporting XMLs that are compatible with hwloc 1.x.
+ Functions diff_load_xml*(), diff_export_xml*() and diff_destroy() in
hwloc/diff.h do not need a topology as first parameter anymore.
+ hwloc_parse_cpumap_file () superseded by hwloc_linux_read_path_as_cpumask()
in hwloc/linux.h.
+ HWLOC_MEMBIND_DEFAULT and HWLOC_MEMBIND_FIRSTTOUCH were clarified.
* New APIs and Features
+ Add hwloc/shmem.h for sharing topologies between processes running on
the same machine (for reducing the memory footprint).
+ Add the experimental netloc subproject. It is disabled by default
and can be enabled with --enable-netloc.
It currently brings command-line tools to gather and visualize the
topology of InfiniBand fabrics, and an API to convert such topologies
into Scotch architectures for process mapping.
See the documentation for details.
* Removed APIs and features
+ Remove the online_cpuset from struct hwloc_obj. Offline PUs get unknown
topologies on Linux nowadays, and wrong topology on Solaris. Other OS
do not support them. And one cannot do much about them anyway. Just keep
them in complete_cpuset.
+ Remove the now-unused "System" object type HWLOC_OBJ_SYSTEM,
defined to MACHINE for backward compatibility.
+ The almost-unused "os_level" attribute has been removed from the
hwloc_obj structure.
+ Remove the custom interface for assembling the topologies of different
nodes as well as the hwloc-assembler tools.
+ hwloc_topology_set_fsroot() is removed, the environment variable
HWLOC_FSROOT may be used for the same remote testing/debugging purpose.
+ Remove the deprecated hwloc_obj_snprintf(), hwloc_obj_type_of_string(),
hwloc_distribute[v]().
* Remove Myrinet Express interoperability (hwloc/myriexpress.h).
+ Remove Kerrighed support from the Linux backend.
+ Remove Tru64 (OSF/1) support.
- Remove HWLOC_MEMBIND_REPLICATE which wasn't available anywhere else.
* Backend improvements
+ Linux
- OS devices do not have to be attached through PCI anymore,
for instance enabling the discovery of NVDIMM block devices.
- Remove the dependency on libnuma.
- Add a SectorSize attribute to block OS devices.
+ Mac OS X
- Fix detection of cores and hyperthreads.
- Add CPUVendor, Model, ... attributes.
+ Windows
- Add get_area_memlocation().
* Tools
+ lstopo and hwloc-info have a new --filter option matching the new filtering API.
+ lstopo can be given --children-order=plain to force a basic displaying
of memory and normal children together below their parent.
+ hwloc-distances was removed and replaced with lstopo --distances.
* Misc
+ Exports
- Exporting to synthetic now ignores I/O and Misc objects.
+ PCI discovery
- Separate OS device discovery from PCI discovery. Only the latter is disabled
with --disable-pci at configure time. Both may be disabled with --disable-io.
- The `linuxpci' component is now renamed into `linuxio'.
- The old `libpci' component name from hwloc 1.6 is not supported anymore,
only the `pci' name from hwloc 1.7 is now recognized.
- The HWLOC_PCI_<domain>_<bus>_LOCALCPUS environment variables are superseded
with a single HWLOC_PCI_LOCALITY where bus ranges may be specified.
- Do not set PCI devices and bridges name automatically. Vendor and device
names are already in info attributes.
+ Components and discovery
- Add HWLOC_SYNTHETIC environment variable to enforce a synthetic topology
as if hwloc_topology_set_synthetic() had been called.
- HWLOC_COMPONENTS doesn't support xml or synthetic component attributes
anymore, they should be passed in HWLOC_XMLFILE or HWLOC_SYNTHETIC instead.
- HWLOC_COMPONENTS takes precedence over other environment variables
for selecting components.
+ hwloc now requires a C99 compliant compiler.
Version 1.11.13 (also included in 2.0.4)
---------------
* Add support for Linux 5.3 new sysfs cpu topology files with Die information.
* Add support for Intel v2 Extended Topology Enumeration in the x86 backend.
* Tiles, Modules and Dies are exposed as Groups for now.
+ HWLOC_DONT_MERGE_DIE_GROUPS=1 may be set in the environment to prevent
Die groups from being automatically merged with identical parent or children.
* Ignore NUMA node information from AMD topoext in the x86 backend,
unless HWLOC_X86_TOPOEXT_NUMANODES=1 is set in the environment.
* Group objects have a new "dont_merge" attribute to prevent them from
being automatically merged with identical parent or children.
Version 1.11.12 (also included in 2.0.3)
---------------
* Fix a corner case of hwloc_topology_restrict() where children would
become out-of-order.
* Fix the return length of export_xmlbuffer() functions to always
include the ending \0.
Version 1.11.11 (also included in 2.0.2)
---------------
* Add support for Hygon Dhyana processors in the x86 backend,
thanks to Pu Wen for the patch.
* Fix symbol renaming to also rename internal components,
thanks to Evan Ramos for the patch.
* Fix build on HP-UX, thanks to Richard Lloyd for reporting the issues.
* Detect PCI link speed without being root on Linux >= 4.13.
Version 1.11.10 (also included in 2.0.1)
---------------
* Fix detection of cores and hyperthreads on Mac OS X.
* Serialize pciaccess discovery to fix concurrent topology loads in
multiple threads.
* Fix first touch area memory binding on Linux when thread memory
binding is different.
* Some minor fixes to memory binding.
* Fix hwloc-dump-hwdata to only process SMBIOS information that correspond
to the KNL and KNM configuration.
* Add a heuristic for guessing KNL/KNM memory and cluster modes when
hwloc-dump-hwdata could not run as root earlier.
* Fix discovery of NVMe OS devices on Linux >= 4.0.
* Add get_area_memlocation() on Windows.
* Add CPUVendor, Model, ... attributes on Mac OS X.
Version 1.11.9
--------------
* Add support for Zhaoxin ZX-C and ZX-D processors in the x86 backend,
thanks to Jeff Zhao for the patch.
* Fix AMD Epyc 24-core L3 cache locality in the x86 backend.
* Don't crash in the x86 backend when the CPUID vendor string is unknown.
* Fix the missing pu discovery support bit on some OS.
* Fix the management of the lstopoStyle info attribute for custom colors.
* Add verbose warnings when failing to load hwloc v2.0+ XMLs.
Version 1.11.8
--------------
* Multiple Solaris improvements, thanks to Maureen Chew for the help:
+ Detect caches on Sparc.
+ Properly detect allowed/disallowed PUs and NUMA nodes with processor sets.
+ Add hwloc_get_last_cpu_location() support for the current thread.
* Add support for CUDA compute capability 7.0 and fix support for 6.[12].
* Tools improvements
+ Fix search for objects by physical index in command-line tools.
+ Add missing "cpubind:get_thisthread_last_cpu_location" in the output
of hwloc-info --support.
+ Add --pid and --name to specify target processes in hwloc-ps.
+ Display thread names in lstopo and hwloc-ps on Linux.
* Doc improvements
+ Add a FAQ entry about building on Windows.
+ Install missing sub-manpage for hwloc_obj_add_info() and
hwloc_obj_get_info_by_name().
Version 1.11.7
--------------
* Fix hwloc-bind --membind for CPU-less NUMA nodes (again).
Thanks to Gilles Gouaillardet for reporting the issue.
* Fix a memory leak on IBM S/390 platforms running Linux.
* Fix a memory leak when forcing the x86 backend first on amd64/topoext
platforms running Linux.
* Command-line tools now support "hbm" instead "numanode" for filtering
only high-bandwidth memory nodes when selecting locations.
+ hwloc-bind also support --hbm and --no-hbm for filtering only or
no HBM nodes.
Thanks to Nicolas Denoyelle for the suggestion.
* Add --children and --descendants to hwloc-info for listing object
children or object descendants of a specific type.
* Add --no-index, --index, --no-attrs, --attrs to disable/enable display
of index numbers or attributes in the graphical lstopo output.
* Try to gather hwloc-dump-hwdata output from all possible locations
in hwloc-gather-topology.
* Updates to the documentation of locations in hwloc(7) and
command-line tools manpages.
Version 1.11.6
--------------
* Make the Linux discovery about twice faster, especially on the CPU side,
by trying to avoid sysfs file accesses as much as possible.
* Add support for AMD Family 17h processors (Zen) SMT cores in the Linux
and x86 backends.
* Add the HWLOC_TOPOLOGY_FLAG_THISSYSTEM_ALLOWED_RESOURCES flag (and the
HWLOC_THISSYSTEM_ALLOWED_RESOURCES environment variable) for reading the
set of allowed resources from the local operating system even if the
topology was loaded from XML or synthetic.
* Fix hwloc_bitmap_set/clr_range() for infinite ranges that do not
overlap currently defined ranges in the bitmap.
* Don't reset the lstopo zoom scale when moving the X11 window.
* lstopo now has --flags for manually setting topology flags.
* hwloc_get_depth_type() returns HWLOC_TYPE_DEPTH_UNKNOWN for Misc objects.
Version 1.11.5
--------------
* Add support for Knights Mill Xeon Phi, thanks to Piotr Luc for the patch.
* Reenable distance gathering on Solaris, disabled by mistake since v1.0.
Thanks to TU Wien for the help.
* Fix hwloc_get_*obj*_inside_cpuset() functions to ignore objects with
empty CPU sets, for instance, CPU-less NUMA nodes such as KNL MCDRAM.
Thanks to Nicolas Denoyelle for the report.
* Fix XML import of multiple distance matrices.
* Add a FAQ entry about "hwloc is only a structural model, it ignores
performance models, memory bandwidth, etc.?"
Version 1.11.4
--------------
* Add MemoryMode and ClusterMode attributes in the Machine object on KNL.
Add doc/examples/get-knl-modes.c for an example of retrieving them.
Thanks to Grzegorz Andrejczuk.
* Fix Linux build with -m32 with respect to libudev.
Thanks to Paul Hargrove for reporting the issue.
* Fix build with Visual Studio 2015, thanks to Eloi Gaudry for reporting
the issue and providing the patch.
* Don't forget to display OS device children in the graphical lstopo.
* Fix a memory leak on Solaris, thanks to Bryon Gloden for the patch.
* Properly handle realloc() failures, thanks to Bryon Gloden for reporting
the issue.
* Fix lstopo crash in ascii/fig/windows outputs when some objects have a
lstopoStyle info attribute.
Version 1.11.3
--------------
* Bug fixes
+ Fix a memory leak on Linux S/390 hosts with books.
+ Fix /proc/mounts parsing on Linux by using mntent.h.
Thanks to Nathan Hjelm for reporting the issue.
+ Fix a x86 infinite loop on VMware due to the x2APIC feature being
advertised without actually being fully supported.
Thanks to Jianjun Wen for reporting the problem and testing the patch.
+ Fix the return value of hwloc_alloc() on mmap() failure.
Thanks to Hugo Brunie for reporting the issue.
+ Fix the return value of command-line tools in some error cases.
+ Do not break individual thread bindings during x86 backend discovery in a
multithreaded process. Thanks to Farouk Mansouri for the report.
+ Fix hwloc-bind --membind for CPU-less NUMA nodes.
+ Fix some corner cases in the XML export/import of application userdata.
* API Improvements
+ Add HWLOC_MEMBIND_BYNODESET flag so that membind() functions accept
either cpusets or nodesets.
+ Add hwloc_get_area_memlocation() to check where pages are actually
allocated. Only implemented on Linux for now.
- There's no _nodeset() variant, but the new flag HWLOC_MEMBIND_BYNODESET
is supported.
+ Make hwloc_obj_type_sscanf() parse back everything that may be outputted
by hwloc_obj_type_snprintf().
* Detection Improvements
+ Allow the x86 backend to add missing cache levels, so that it completes
what the Solaris backend lacks.
Thanks to Ryan Zezeski for reporting the issue.
+ Do not filter-out FibreChannel PCI adapters by default anymore.
Thanks to Matt Muggeridge for the report.
+ Add support for CUDA compute capability 6.x.
* Tools
+ Add --support to hwloc-info to list supported features, just like with
hwloc_topology_get_support().
- Also add --objects and --topology to explicitly switch between the
default modes.
+ Add --tid to let hwloc-bind operate on individual threads on Linux.
+ Add --nodeset to let hwloc-bind report memory binding as NUMA node sets.
+ hwloc-annotate and lstopo don't drop application userdata from XMLs anymore.
- Add --cu to hwloc-annotate to drop these application userdata.
+ Make the hwloc-dump-hwdata dump directory configurable through configure
options such as --runstatedir or --localstatedir.
* Misc Improvements
+ Add systemd service template contrib/systemd/hwloc-dump-hwdata.service
for launching hwloc-dump-hwdata at boot on Linux.
Thanks to Grzegorz Andrejczuk.
+ Add HWLOC_PLUGINS_BLACKLIST environment variable to prevent some plugins
from being loaded. Thanks to Alexandre Denis for the suggestion.
+ Small improvements for various Windows build systems,
thanks to Jonathan L Peyton and Marco Atzeri.
Version 1.11.2
--------------
* Improve support for Intel Knights Landing Xeon Phi on Linux:
+ Group local NUMA nodes of normal memory (DDR) and high-bandwidth memory
(MCDRAM) together through "Cluster" groups so that the local MCDRAM is
easy to find.
- See "How do I find the local MCDRAM NUMA node on Intel Knights
Landing Xeon Phi?" in the documentation.
- For uniformity across all KNL configurations, always have a NUMA node
object even if the host is UMA.
+ Fix the detection of the memory-side cache:
- Add the hwloc-dump-hwdata superuser utility to dump SMBIOS information
into /var/run/hwloc/ as root during boot, and load this dumped
information from the hwloc library at runtime.
- See "Why do I need hwloc-dump-hwdata for caches on Intel Knights
Landing Xeon Phi?" in the documentation.
Thanks to Grzegorz Andrejczuk for the patches and for the help.
* The x86 and linux backends may now be combined for discovering CPUs
through x86 CPUID and memory from the Linux kernel.
This is useful for working around buggy CPU information reported by Linux
(for instance the AMD Bulldozer/Piledriver bug below).
Combination is enabled by passing HWLOC_COMPONENTS=x86 in the environment.
* Fix L3 cache sharing on AMD Opteron 63xx (Piledriver) and 62xx (Bulldozer)
in the x86 backend. Thanks to many users who helped.
* Fix the overzealous L3 cache sharing fix added to the x86 backend in 1.11.1
for AMD Opteron 61xx (Magny-Cours) processors.
* The x86 backend may now add the info attribute Inclusive=0 or 1 to caches
it discovers, or to caches discovered by other backends earlier.
Thanks to Guillaume Beauchamp for the patch.
* Fix the management on alloc_membind() allocation failures on AIX, HP-UX
and OSF/Tru64.
* Fix spurious failures to load with ENOMEM on AIX in case of Misc objects
below PUs.
* lstopo improvements in X11 and Windows graphical mode:
+ Add + - f 1 shortcuts to manually zoom-in, zoom-out, reset the scale,
or fit the entire window.
+ Display all keyboard shortcuts in the console.
* Debug messages may be disabled at runtime by passing HWLOC_DEBUG_VERBOSE=0
in the environment when --enable-debug was passed to configure.
* Add a FAQ entry "What are these Group objects in my topology?".
Version 1.11.1
--------------
* Detection fixes
+ Hardwire the topology of Fujitsu K-computer, FX10, FX100 servers to
workaround buggy Linux kernels.
Thanks to Takahiro Kawashima and Gilles Gouaillardet.
+ Fix L3 cache information on AMD Opteron 61xx Magny-Cours processors
in the x86 backend. Thanks to Guillaume Beauchamp for the patch.
+ Detect block devices directly attached to PCI without a controller,
for instance NVMe disks. Thanks to Barry M. Tannenbaum.
+ Add the PCISlot attribute to all PCI functions instead of only the
first one.
* Miscellaneous internal fixes
+ Ignore PCI bridges that could fail assertions by reporting buggy
secondary-subordinate bus numbers