# neural-forecast
v
Hello neural forecast community, I ran into an issue. It seems that when I run the neuralforecast model on a multi-node cluster on Databricks, the worker nodes do not have permission to write logs to the workspace.
An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "/root/.ipykernel/95122/command-913200565561878-697464911", line 18, in test
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/sktime/forecasting/base/_base.py", line 391, in fit
    self._fit(y=y_inner, X=X_inner, fh=fh)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/sktime/forecasting/compose/_ensemble.py", line 161, in _fit
    self._fit_forecasters(forecasters, y_train, X_train, fh_test)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/sktime/forecasting/base/_meta.py", line 67, in _fit_forecasters
    self.forecasters_ = Parallel(n_jobs=self.n_jobs)(
  File "/databricks/python/lib/python3.10/site-packages/joblib/parallel.py", line 1110, in __call__
    self._pickle_cache = None
  File "/databricks/python/lib/python3.10/site-packages/joblib/parallel.py", line 901, in dispatch_one_batch
    self._dispatch(tasks)
  File "/databricks/python/lib/python3.10/site-packages/joblib/parallel.py", line 819, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/databricks/python/lib/python3.10/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/databricks/python/lib/python3.10/site-packages/joblib/_parallel_backends.py", line 597, in __init__
    self.results = batch()
  File "/databricks/python/lib/python3.10/site-packages/joblib/parallel.py", line 288, in __call__
    return [func(*args, **kwargs)
  File "/databricks/python/lib/python3.10/site-packages/joblib/parallel.py", line 288, in <listcomp>
    return [func(*args, **kwargs)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/sktime/forecasting/base/_meta.py", line 65, in _fit_forecaster
    return forecaster.fit(y, X, fh)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/sktime/forecasting/base/_base.py", line 391, in fit
    self._fit(y=y_inner, X=X_inner, fh=fh)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/sktime/forecasting/base/adapters/_neuralforecast.py", line 339, in _fit
    self._forecaster.fit(df=train_dataset, verbose=self.verbose_fit)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/neuralforecast/core.py", line 462, in fit
    self.models[i] = model.fit(
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/neuralforecast/common/_base_recurrent.py", line 530, in fit
    return self._fit(
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/neuralforecast/common/_base_model.py", line 219, in _fit
    trainer.fit(model, datamodule=datamodule)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 544, in fit
    call._call_and_handle_interrupt(
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 71, in _call_and_handle_interrupt
    raise
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 580, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 949, in _run
    call._call_setup_hook(self)  # allow user to set up LightningModule in accelerator environment
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 86, in _call_setup_hook
    if hasattr(logger, "experiment"):
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/lightning_fabric/loggers/logger.py", line 118, in experiment
    return fn(self)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-a526db0a-d3cb-494b-952d-040b25c77273/lib/python3.10/site-packages/lightning_fabric/loggers/tensorboard.py", line 190, in experiment
    self._fs.makedirs(self.root_dir, exist_ok=True)
  File "/databricks/python/lib/python3.10/site-packages/fsspec/implementations/local.py", line 54, in makedirs
    os.makedirs(path, exist_ok=exist_ok)
  File "/usr/lib/python3.10/os.py", line 230, in makedirs
    raise
OSError: [Errno 30] Read-only file system: '/Workspace/Users/*********/Test/lightning_logs'
j
You can set the `logger` trainer argument to one of these or to `False` to disable logging. I'm not sure where sktime sets it, but there should be something like `trainer_kwargs`.
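A minimal sketch of what that looks like when calling neuralforecast directly, assuming the extra keyword arguments on the model constructor are forwarded to the PyTorch Lightning Trainer (which is how `logger=False` would reach it):

```python
import numpy as np
import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.models import LSTM

# Toy series in neuralforecast's long format (unique_id, ds, y).
df = pd.DataFrame({
    "unique_id": "series_1",
    "ds": pd.date_range("2023-01-01", periods=100, freq="D"),
    "y": np.arange(100, dtype=float),
})

# Assumption: extra keyword arguments on the model are passed through to the
# pytorch_lightning Trainer, so logger=False stops it from trying to create a
# lightning_logs directory on the read-only workspace path.
model = LSTM(h=7, input_size=14, max_steps=10, logger=False)

nf = NeuralForecast(models=[model], freq="D")
nf.fit(df=df)
```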
v
@José Morales Oh yes! I do see sktime has a trainer_kwargs, so it would be something like `trainer_kwargs = {"TensorBoardLogger": False}`, right?
j
`trainer_kwargs={'logger': False}`
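Roughly, on the sktime side that ends up looking like the sketch below. The class name `NeuralForecastLSTM`, its import path, and the `freq` argument are assumptions based on the adapter shown in the traceback; the part that matters is passing `trainer_kwargs={"logger": False}` through to the underlying trainer.

```python
import pandas as pd
from sktime.forecasting.neuralforecast import NeuralForecastLSTM  # assumed import path

# Toy series with a PeriodIndex, the index type sktime forecasters expect.
y = pd.Series(
    [float(i) for i in range(100)],
    index=pd.period_range("2023-01-01", periods=100, freq="D"),
)

# logger=False is forwarded to the pytorch_lightning Trainer, so it never
# tries to create lightning_logs under the read-only /Workspace path.
forecaster = NeuralForecastLSTM(
    freq="D",  # assumed constructor argument
    trainer_kwargs={"logger": False},
)
forecaster.fit(y, fh=[1, 2, 3, 4, 5, 6, 7])
y_pred = forecaster.predict(fh=[1, 2, 3, 4, 5, 6, 7])
```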
v
Awesome! Let me try it then! Thank you!
👍 1