Clarifications on the pmbr_boot disk flag #88

Open
tommy-skaug opened this issue Mar 29, 2024 · 4 comments

Comments

@tommy-skaug

In reference to: siderolabs/talos#7066 (comment) and this code block

// References:
// - https://en.wikipedia.org/wiki/GUID_Partition_Table#Protective_MBR_(LBA_0)
// - https://www.syslinux.org/wiki/index.php?title=Doc/gpt
// - https://en.wikipedia.org/wiki/Master_boot_record
// - http://www.rodsbooks.com/gdisk/bios.html
func (g *GPT) newPMBR(h *Header) ([]byte, error) {
	p, err := g.l.ReadAt(0, 0, 512)
	if err != nil {
		return nil, err
	}

	// Boot signature.
	copy(p[510:], []byte{0x55, 0xaa})

	// PMBR protective entry.
	b := p[446 : 446+16]

	if g.markMBRBootable {
		// Some BIOSes in legacy mode won't boot from a disk unless there is at least one
		// partition in the MBR marked bootable. Mark this partition as bootable.
		b[0] = 0x80
	} else {
		b[0] = 0x00
	}

	// Partition type: EFI data partition.
	b[4] = 0xee

	// CHS for the start of the partition
	copy(b[1:4], []byte{0x00, 0x02, 0x00})

	// CHS for the end of the partition
	copy(b[5:8], []byte{0xff, 0xff, 0xff})

	// Partition start LBA.
	binary.LittleEndian.PutUint32(b[8:12], 1)

	// Partition length in sectors.
	// This might overflow uint32, so check accordingly
	if h.BackupLBA > math.MaxUint32 {
		binary.LittleEndian.PutUint32(b[12:16], uint32(math.MaxUint32))
	} else {
		binary.LittleEndian.PutUint32(b[12:16], uint32(h.BackupLBA))
	}

	return p, nil
}

For some reason, my disks were marked with the pmbr_boot flag by the partitioner, and I believe this happened regardless of the legacyBIOSSupport setting in the Talos config. Another machine I was setting up did not have the flag. All disks were wiped with dd before install.

As I read the code, it seems to be first and foremost about marking the boot partition active, not about the pmbr_boot flag on the disk. Is that right? Further, does that mean pmbr_boot is in theory set in any case?
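To make the question concrete, here is a minimal sketch of how the byte written by `newPMBR` could be inspected on an existing disk image. It assumes, and this is an assumption rather than something confirmed in this thread, that the pmbr_boot disk flag reported by parted corresponds to the boot indicator byte (0x80) of the protective 0xEE entry at offset 446 of the first 512 bytes. The function name `pmbrBootSet` and the error values are hypothetical and not part of any Talos or go-blockdevice API.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errShortSector = errors.New("need at least 512 bytes")
	errNoSignature = errors.New("missing 0x55 0xaa boot signature")
)

// pmbrBootSet reports whether the first MBR partition entry is a GPT
// protective entry (type 0xEE) with the boot indicator set to 0x80.
// Assumption: this is the byte that parted surfaces as pmbr_boot.
func pmbrBootSet(sector []byte) (bool, error) {
	if len(sector) < 512 {
		return false, errShortSector
	}

	// The same boot signature that newPMBR writes at offset 510.
	if sector[510] != 0x55 || sector[511] != 0xaa {
		return false, errNoSignature
	}

	// First partition entry, same slice bounds as in newPMBR.
	entry := sector[446 : 446+16]

	return entry[4] == 0xee && entry[0] == 0x80, nil
}

func main() {
	// Build a synthetic sector instead of reading a real disk.
	sector := make([]byte, 512)
	copy(sector[510:], []byte{0x55, 0xaa})
	sector[446+4] = 0xee // protective partition type
	sector[446] = 0x80   // boot indicator

	set, err := pmbrBootSet(sector)
	fmt.Println(set, err) // prints "true <nil>"
}
```

To run this against a real device, the first 512 bytes would have to be read from the disk (for example with `os.Open` and `io.ReadFull`) and passed in as `sector`.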

Example partition table from one of the installs (caveat: this is with an installer that sets the EFI partition to 500MB):

/ # parted /dev/nvme0n1 p
Warning: Not all of the space available to /dev/nvme0n1 appears to be used, you can fix the GPT to use all
of the space (an extra 28 blocks) or continue with the current setting?
Fix/Ignore? i
Model: APPLE SSD AP0512M (nvme)
Disk /dev/nvme0n1: 500GB
Sector size (logical/physical): 4096B/4096B
Partition Table: gpt
Disk Flags: pmbr_boot

Number  Start   End     Size    File system  Name       Flags
 1      1049kB  525MB   524MB   fat32        EFI        boot, esp
 2      525MB   526MB   1049kB               BIOS       bios_grub, legacy_boot
 3      526MB   1575MB  1049MB  xfs          BOOT
 4      1575MB  1576MB  1049kB               META
 5      1576MB  1681MB  105MB   xfs          STATE
 6      1681MB  500GB   499GB   xfs          EPHEMERAL

In effect, at least on the Mac Mini and for devices following the spec, this leads to the device refusing to boot from EFI, which was the case for me. Removing the flag made the device boot like it should.
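The post does not say how the flag was removed. With parted, the `disk_set pmbr_boot off` command is one way to do it. As an illustration of what that amounts to at the byte level, the sketch below clears the boot indicator in an in-memory copy of the first sector. It rests on the assumption that pmbr_boot maps to byte 0 of the protective 0xEE entry at offset 446; the function name `clearPMBRBoot` is hypothetical, and the code deliberately does not write anything back to a disk.

```go
package main

import (
	"errors"
	"fmt"
)

// clearPMBRBoot clears the boot indicator (0x80) on the first MBR
// partition entry, but only if that entry is a GPT protective entry
// (type 0xEE). It modifies sector in place and reports whether a
// change was made.
func clearPMBRBoot(sector []byte) (bool, error) {
	if len(sector) < 512 {
		return false, errors.New("need at least 512 bytes")
	}

	entry := sector[446 : 446+16]

	// Refuse to touch anything that is not a protective MBR entry.
	if entry[4] != 0xee {
		return false, errors.New("first MBR entry is not a protective GPT entry")
	}

	if entry[0] != 0x80 {
		return false, nil // flag already clear, nothing to do
	}

	entry[0] = 0x00

	return true, nil
}

func main() {
	// Synthetic sector with the flag set, standing in for a real disk.
	sector := make([]byte, 512)
	copy(sector[510:], []byte{0x55, 0xaa})
	sector[446] = 0x80
	sector[446+4] = 0xee

	changed, err := clearPMBRBoot(sector)
	fmt.Println(changed, err, sector[446]) // prints "true <nil> 0"
}
```

Keeping the function pure (buffer in, buffer out) leaves the decision about when and how to write the sector back to the caller, which matters for a destructive operation on a boot disk.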

In other words, this may be either a bug or something that needs clarification in the docs.

I was also wondering in which cases the disk flags are applied (e.g. when a reset is run)?

@smira
Member

smira commented Apr 1, 2024

I can't reproduce that; a fresh Talos install doesn't set the flag:

parted ~/.talos/clusters/talos-default/talos-default-controlplane-1-0.disk p
WARNING: You are not superuser.  Watch out for permissions.
Warning: Unable to open /scratch/talos/clusters/talos-default/talos-default-controlplane-1-0.disk read-write (Permission denied).
/scratch/talos/clusters/talos-default/talos-default-controlplane-1-0.disk has been opened read-only.
Model:  (file)                                                            
Disk /scratch/talos/clusters/talos-default/talos-default-controlplane-1-0.disk: 6442MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name       Flags
 1      1049kB  106MB   105MB   fat32        EFI        boot, esp
 2      106MB   107MB   1049kB               BIOS       bios_grub, legacy_boot
 3      107MB   1156MB  1049MB  xfs          BOOT
 4      1156MB  1157MB  1049kB               META
 5      1157MB  1261MB  105MB   xfs          STATE
 6      1261MB  6441MB  5180MB  xfs          EPHEMERAL

@tommy-skaug
Author

tommy-skaug commented Apr 2, 2024

@smira thanks for the follow-up. I may be getting confused by the function name here. Do I understand you correctly that the pmbr_boot flag shouldn't be set by the partitioner generally and regardless of the legacyBIOSSupport config setting in Talos?

@smira
Member

smira commented Apr 2, 2024

Yes, it should not set it unless explicitly enabled, but it would preserve the flag value if it's already set.

@tommy-skaug
Author

Got it. In that case I'm out of explanations as to why the disks on the 6 machines ended up this way. They all booted nicely around 1.5.4, were wiped with dd, and since then only Talos with OS upgrades has been running on them (with a mod for the partition size). With that in mind, I'm happy to close the ticket since I haven't been able to reproduce the issue myself either.
