Running on a cloud GPU, an NVIDIA GeForce RTX 4090.

Environment:
Python 3.8 (ubuntu20.04), CUDA 11.3 (base image lists PyTorch 1.10.0)
pytorch-lightning==1.9.2
torch==1.13.1
deepspeed==0.7.0

Console output:
```
python train.py \
  --load_model "/rwkv/RWKV-4-World-CHNtuned-1.5B-v1-20230620-ctx4096.pth" \
  --proj_dir "/rwkv/output" \
  --data_file "/rwkv/binidx/mission_text_document" \
  --data_type binidx \
  --vocab_size 50277 \
  --ctx_len 1024 \
  --accumulate_grad_batches 8 \
  --epoch_steps 200 \
  --epoch_count 20 \
  --epoch_begin 0 \
  --epoch_save 2 \
  --micro_bsz 8 \
  --n_layer 24 \
  --n_embd 2048 \
  --pre_ffn 0 \
  --head_qk 0 \
  --lr_init 1e-5 \
  --lr_final 1e-5 \
  --warmup_steps 0 \
  --beta1 0.9 \
  --beta2 0.999 \
  --adam_eps 1e-8 \
  --accelerator gpu \
  --devices 1 \
  --precision bf16 \
  --strategy deepspeed_stage_2 \
  --grad_cp 1 \
  --lora \
  --lora_r 8 \
  --lora_alpha 32 \
  --lora_dropout 0.01 \
  --lora_parts=att,ffn,time,ln

########## work in progress ##########
############################################################################
#
# RWKV-4 BF16 on 1x1 GPU, bsz 1x1x8=8, deepspeed_stage_2 with grad_cp
#
# Data = /rwkv/binidx/mission_text_document (binidx), ProjDir = /rwkv/output
#
# Epoch = 0 to 19 (will continue afterwards), save every 2 epoch
#
# Each "epoch" = 200 steps, 1600 samples, 1638400 tokens
#
# Model = 24 n_layer, 2048 n_embd, 1024 ctx_len
# LoRA = enabled, 8 r, 32.0 alpha, 0.01 dropout, on att,ffn,time,ln
#
# Adam = lr 1e-05 to 1e-05, warmup 0 steps, beta (0.9, 0.999), eps 1e-08
#
# Found torch 1.13.1+cu117, recommend 1.13.1+cu117 or newer
# Found deepspeed 0.7.0, recommend 0.7.0 (faster than newer versions)
# Found pytorch_lightning 1.9.2, recommend 1.9.1 or newer
#
############################################################################

{'load_model': '/rwkv/RWKV-4-World-CHNtuned-1.5B-v1-20230620-ctx4096.pth', 'wandb': '', 'proj_dir': '/rwkv/output', 'random_seed': -1, 'data_file': '/rwkv/binidx/mission_text_document', 'data_type': 'binidx', 'vocab_size': 50277, 'ctx_len': 1024, 'epoch_steps': 200, 'epoch_count': 20, 'epoch_begin': 0, 'epoch_save': 2, 'micro_bsz': 8, 'n_layer': 24, 'n_embd': 2048, 'dim_att': 2048, 'dim_ffn': 8192, 'pre_ffn': 0, 'head_qk': 0, 'tiny_att_dim': 0, 'tiny_att_layer': -999, 'lr_init': 1e-05, 'lr_final': 1e-05, 'warmup_steps': 0, 'beta1': 0.9, 'beta2': 0.999, 'adam_eps': 1e-08, 'grad_cp': 1, 'my_pile_stage': 0, 'my_pile_shift': -1, 'my_pile_edecay': 0, 'layerwise_lr': 1, 'ds_bucket_mb': 200, 'my_img_version': 0, 'my_img_size': 0, 'my_img_bit': 0, 'my_img_clip': 'x', 'my_img_clip_scale': 1, 'my_img_l1_scale': 0, 'my_img_encoder': 'x', 'my_sample_len': 0, 'my_ffn_shift': 1, 'my_att_shift': 1, 'my_pos_emb': 0, 'load_partial': 0, 'magic_prime': 0, 'my_qa_mask': 0, 'my_testing': '', 'lora': True, 'lora_load': '', 'lora_r': 8, 'lora_alpha': 32.0, 'lora_dropout': 0.01, 'lora_parts': 'att,ffn,time,ln', 'logger': False, 'enable_checkpointing': False, 'default_root_dir': None, 'gradient_clip_val': 1.0, 'gradient_clip_algorithm': None, 'num_nodes': 1, 'num_processes': None, 'devices': '1', 'gpus': None, 'auto_select_gpus': None, 'tpu_cores': None, 'ipus': None, 'enable_progress_bar': True, 'overfit_batches': 0.0, 'track_grad_norm': -1, 'check_val_every_n_epoch': 100000000000000000000, 'fast_dev_run': False, 'accumulate_grad_batches': 8, 'max_epochs': -1, 'min_epochs': None, 'max_steps': -1, 'min_steps': None, 'max_time': None, 'limit_train_batches': None, 'limit_val_batches': None, 'limit_test_batches': None, 'limit_predict_batches': None, 'val_check_interval': None, 'log_every_n_steps': 100000000000000000000, 'accelerator': 'gpu', 'strategy': 'deepspeed_stage_2', 'sync_batchnorm': False, 'precision': 'bf16', 'enable_model_summary': True, 'num_sanity_val_steps': 0, 'resume_from_checkpoint': None, 'profiler': None, 'benchmark': None, 'reload_dataloaders_every_n_epochs': 0, 'auto_lr_find': False, 'replace_sampler_ddp': False, 'detect_anomaly': False, 'auto_scale_batch_size': False, 'plugins': None, 'amp_backend': None, 'amp_level': None, 'move_metrics_to_cpu': False, 'multiple_trainloader_mode': 'max_size_cycle', 'inference_mode': True, 'my_timestamp': '2023-10-30-18-23-23', 'betas': (0.9, 0.999), 'real_bsz': 8, 'run_name': '50277 ctx1024 L24 D2048'}

!!!!! LoRA Warning: Gradient Checkpointing requires JIT off, disabling it RWKV_MY_TESTING
Using /root/.cache/torch_extensions/py38_cu117 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /root/.cache/torch_extensions/py38_cu117/wkv_1024_bf16/build.ninja...
Building extension module wkv_1024_bf16...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module wkv_1024_bf16...
Current vocab size = 50277 (make sure it's correct)
Traceback (most recent call last):
  File "train.py", line 284, in <module>
    train_data = MyDataset(args)
  File "/root/rwkv/RWKV-LM-LoRA/RWKV-v4neo/src/dataset.py", line 31, in __init__
    self.data = MMapIndexedDataset(args.data_file)
  File "/root/rwkv/RWKV-LM-LoRA/RWKV-v4neo/src/binidx.py", line 179, in __init__
    self._do_init(path, skip_warmup)
  File "/root/rwkv/RWKV-LM-LoRA/RWKV-v4neo/src/binidx.py", line 189, in _do_init
    self._index = self.Index(index_file_path(self._path), skip_warmup)
  File "/root/rwkv/RWKV-LM-LoRA/RWKV-v4neo/src/binidx.py", line 105, in __init__
    with open(path, "rb") as stream:
FileNotFoundError: [Errno 2] No such file or directory: '/rwkv/binidx/mission_text_document.idx'
Exception ignored in: <function MMapIndexedDataset.Index.__del__ at 0x7fc7c90c4790>
Traceback (most recent call last):
  File "/root/rwkv/RWKV-LM-LoRA/RWKV-v4neo/src/binidx.py", line 150, in __del__
    self._bin_buffer_mmap._mmap.close()
AttributeError: 'Index' object has no attribute '_bin_buffer_mmap'
Exception ignored in: <function MMapIndexedDataset.__del__ at 0x7fc7c90c4d30>
Traceback (most recent call last):
  File "/root/rwkv/RWKV-LM-LoRA/RWKV-v4neo/src/binidx.py", line 202, in __del__
    self._bin_buffer_mmap._mmap.close()
AttributeError: 'MMapIndexedDataset' object has no attribute '_bin_buffer_mmap'
```
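The root cause is the first `FileNotFoundError`: as the traceback shows, `MMapIndexedDataset` builds its index path by appending `.idx` to the `--data_file` prefix, so `/rwkv/binidx/mission_text_document.idx` (and its `.bin` token-data companion, in the usual binidx layout) must exist before training starts; the two `__del__` exceptions afterwards are just cleanup noise from the failed constructor. A quick sanity check could look like this (a minimal sketch; `check_binidx` is a hypothetical helper, not part of the repo):

```python
import os

def check_binidx(prefix):
    """Return the companion files missing for a binidx dataset prefix.

    MMapIndexedDataset opens <prefix>.idx (the index); the token data
    normally lives alongside it in <prefix>.bin.
    """
    return [prefix + ext
            for ext in (".idx", ".bin")
            if not os.path.exists(prefix + ext)]

# Same value as --data_file in the command above
missing = check_binidx("/rwkv/binidx/mission_text_document")
if missing:
    print("missing files:", missing)
```

If both files are missing, the dataset was likely never converted to binidx format (or `--data_file` points at the wrong directory); note the prefix must be given without the `.idx`/`.bin` extension.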