declearn.dataset.utils.save_data_array

Save a data array to a dump file.

Supported types of data arrays

  • pandas.DataFrame or pandas.Series: Dump to a comma-separated .csv file.
  • numpy.ndarray: Dump to a non-pickle .npy file.
  • scipy.sparse.spmatrix: Dump to a .sparse file, using a custom format and declearn.data.sparse.sparse_to_file.
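
As an illustration of the `numpy.ndarray` case, the following is a minimal sketch that calls `numpy.save` directly rather than `save_data_array`. The array contents and the `features.npy` file name are made up for the example. It suggests that a plain numeric array written this way can be read back with pickling disabled.

```python
import os
import tempfile

import numpy as np

# Illustrative numeric array (assumed data, not from the documentation).
array = np.arange(6, dtype=np.float64).reshape(2, 3)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "features.npy")
    # Same call shape as the ndarray branch: path first, array as `arr`.
    np.save(path, arr=array)
    # Numeric arrays are stored without pickled objects, so loading
    # with allow_pickle=False works.
    restored = np.load(path, allow_pickle=False)

same = np.array_equal(array, restored)
```

Arrays of `object` dtype are a different matter, since `numpy.save` falls back to pickling for those by default.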

Parameters:

  • path (str, required): Path to the file the array is dumped to. The appropriate file extension is appended when not already present (i.e. path may be a basename).
  • array (Union[DataArray, pd.Series], required): Data array to dump to file. See above for supported types and associated behaviours.

Returns:

  • path (str): Path to the created file dump, based on the input path and the chosen file extension (see above).
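
The path handling described above can be sketched with the standard library alone. `resolve_dump_path` is a hypothetical helper written for this illustration, not part of declearn. It mirrors the last steps of the source listing: append the extension when missing, then create the parent folder.

```python
import os
import tempfile


def resolve_dump_path(path: str, ext: str) -> str:
    """Hypothetical helper mirroring the path handling in the source."""
    # Append the extension only when the path does not already end with it.
    if not path.endswith(ext):
        path += ext
    # Create the parent directory (and any missing ancestors) if needed.
    os.makedirs(os.path.dirname(os.path.abspath(path)), exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as folder:
    base = os.path.join(folder, "dumps", "train_inputs")
    # A basename gets the extension appended.
    with_ext = resolve_dump_path(base, ".npy")
    # A path that already carries the extension is returned unchanged.
    unchanged = resolve_dump_path(base + ".npy", ".npy")
    folder_created = os.path.isdir(os.path.join(folder, "dumps"))
```

Because the check is a plain suffix test, a path ending in a different extension (say `data.txt` with a NumPy array) would come back as `data.txt.npy`, so callers should rely on the returned path rather than the one they passed in.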

Raises:

  • TypeError: If array is of an unsupported type.

Source code in declearn/dataset/utils/_save_load.py
def save_data_array(
    path: str,
    array: Union[DataArray, pd.Series],
) -> str:
    """Save a data array to a dump file.

    Supported types of data arrays
    ------------------------------
    - `pandas.DataFrame` or `pandas.Series`:
        Dump to a comma-separated `.csv` file.
    - `numpy.ndarray`:
        Dump to a non-pickle `.npy` file.
    - `scipy.sparse.spmatrix`:
        Dump to a `.sparse` file, using a custom format
        and `declearn.data.sparse.sparse_to_file`.

    Parameters
    ----------
    path: str
        Path to the file where to dump the array.
        Appropriate file extension will be added when
        not present (i.e. `path` may be a basename).
    array: data array structure (see above)
        Data array that needs dumping to file.
        See above for supported types and associated
        behaviours.

    Returns
    -------
    path: str
        Path to the created file dump, based on the input
        `path` and the chosen file extension (see above).

    Raises
    ------
    TypeError
        If `array` is of unsupported type.
    """
    # Select a file extension and set up the array-dumping function.
    if isinstance(array, (pd.DataFrame, pd.Series)):
        ext = ".csv"
        save = functools.partial(
            array.to_csv, sep=",", encoding="utf-8", index=False
        )
    elif isinstance(array, np.ndarray):
        ext = ".npy"
        save = functools.partial(np.save, arr=array)
    elif isinstance(array, spmatrix):
        ext = ".sparse"
        save = functools.partial(sparse_to_file, matrix=array)
    else:
        raise TypeError(f"Unsupported data array type: '{type(array)}'.")
    # Ensure proper naming. Save the array. Return the path.
    if not path.endswith(ext):
        path += ext
    os.makedirs(os.path.dirname(os.path.abspath(path)), exist_ok=True)
    save(path)
    return path