Getting Started
Installing mlky via pip is recommended at this time.
Installation
You can install mlky using pip:
pip install mlky
Or via Conda:
conda install -c jammont mlky
Quick Overview
To get started with mlky, import the Config object and pass it a YAML file, a YAML string, or a dict:
>>> from mlky import Config
>>> # Empty initially
>>> Config
D{}
>>> # Now initialized
>>> Config({'A': {'a': 1, 'b': 2}, 'B': {'a': 0, 'c': 3}, 'C': ['d', 'e']})
D{'A': D{'a': V=1, 'b': V=2}, 'B': D{'a': V=0, 'c': V=3}, 'C': L[V='d', V='e']}
The object uses tags to represent what each part is:
- D{...} for dict objects
- L[...] for list objects
- V=... for variable objects
These can be accessed using either dot or dict notation:
>>> Config.A
D{'a': V=1, 'b': V=2}
>>> Config['B']
D{'a': V=0, 'c': V=3}
>>> Config.A.a
1
>>> Config['B']['a']
0
>>> Config['A'].b
2
>>> Config.B['c']
3
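To illustrate how dot and dict notation can resolve to the same values, here is a minimal sketch using a hypothetical DotDict class. It is a stand-in written for this example, not mlky's implementation, and it only covers read access:

```python
class DotDict(dict):
    """Hypothetical stand-in (not part of mlky): a dict readable by attribute."""

    def __getitem__(self, key):
        value = super().__getitem__(key)
        # Wrap nested plain dicts so chained access keeps working
        if isinstance(value, dict) and not isinstance(value, DotDict):
            value = DotDict(value)
        return value

    def __getattr__(self, key):
        # Only reached when normal attribute lookup fails,
        # so built-in dict methods such as .keys() are unaffected
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key) from None


cfg = DotDict({'A': {'a': 1, 'b': 2}, 'B': {'a': 0, 'c': 3}})

# The two notations can be mixed freely
same_value = cfg.A.a == cfg['A']['a'] == cfg['A'].a == cfg.A['a']
```

Routing attribute lookups through item lookups keeps a single source of truth for the stored values, which is why both notations always agree.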
The Config object is also a singleton, though copies can be made to create local versions:
def set_param(key, value, copy=False):
    if copy:
        config = Config.deepCopy()
    else:
        config = Config()
    config[key] = value

def get_param(key, copy=False):
    if copy:
        config = Config.deepCopy()
    else:
        config = Config()
    return config[key]
>>> set_param('persist', True) # Global
>>> get_param('persist') # Global
True
>>> set_param('local', True, copy=True)
>>> get_param('local')
Null
>>> get_param('persist', copy=True) # Copies global instance
True
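The reason the 'local' key never reaches the global instance can be sketched with plain Python objects. This example assumes a module-level dict named GLOBAL as a hypothetical stand-in for the singleton and uses the standard library's copy.deepcopy; it does not call mlky:

```python
import copy

# Hypothetical stand-in for the global Config instance (a plain dict, not mlky)
GLOBAL = {}


def set_param(key, value, use_copy=False):
    # A deep copy is fully independent, so writes to it never reach GLOBAL
    target = copy.deepcopy(GLOBAL) if use_copy else GLOBAL
    target[key] = value
    return target


set_param('persist', True)                       # written to the shared object
local = set_param('local', True, use_copy=True)  # written to a throwaway copy
```

The copy starts with everything the shared object held at that moment ('persist'), but anything written to it afterwards ('local') is lost unless the caller keeps a reference to the copy.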
Because it is a singleton, you can use Config directly instead of assigning it to a variable, and the same object is shared across the Python instance:
# Script 1
from mlky import Config
Config(a=1, b=2) # initialize somewhere
# Script 2
from mlky import Config
assert Config.a == 1
assert Config.b == 2
Ideally, initialize the Config object once at the beginning and then leverage the global instance:
from mlky import Config

def process(item):
    if Config.param:
        ...

def main():
    for item in Config.process:
        process(item)

if __name__ == '__main__':
    Config('/some/config.yaml')
    main()
Detailed Walkthrough
The following will be used as an example:
import argparse
from glob import glob

from mlky import Config

def process(files):
    for file in files:
        with open(file, 'r') as f:
            lines = f.readlines()

        # Drop the first Config.header lines when header skipping is enabled
        if Config.skip_header:
            lines = lines[Config.header:]

        ... # Some arbitrary other processing code

        # Append the processed lines to the output file, if one is configured
        if Config.output:
            with open(Config.output.file, 'a') as f:
                f.writelines(lines)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-c', '--config', required=True)
    parser.add_argument('-p', '--patch')
    args = parser.parse_args()

    # Initialize the global instance from the config file and patch string
    Config(args.config, args.patch)

    if Config.get('input'):
        files = glob(f'{Config.input}/*')
        process(files)
    else:
        print('Error: No input provided!')
Command: python script.py -c /some/config.yml -p "sect1<-sect2"
Calling the script with the above command will step through:
- Initialize the global Config instance with the file and patch provided
  a. That is, sect1 in the config.yml will be patched with sect2
- Check if Config.input exists
  a. If it is in the Config, return it as-is. If it is not, this will be a Null value, which evaluates to False
- Use the value at Config.input to glob some directory
- Process the collected files
- For each file, read in the data
- If Config.skip_header is defined, use the (expected to be an int) value of Config.header
  a. It is on the user to ensure proper safeguards are in place.
  b. Multiple possible safeguards include:
    - int(Config.header) to raise an exception if the value cannot be cast to an integer
    - Config.get('header', 5) to use a default value if this key is not in the config
    - A definitions file to ensure this key is an int (safest)
- Check if Config.output is defined, which is expected to be a Sect
- Append the data to Config.output.file
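The first two safeguards for the header value can be sketched as follows. This example uses a plain dict as a hypothetical stand-in for the loaded config, and the header_rows helper is invented for illustration; neither is part of mlky:

```python
# Hypothetical stand-in for a loaded config (a plain dict, not an mlky object)
config = {'skip_header': True, 'header': '2'}


def header_rows(cfg, default=5):
    """Return how many leading lines to skip, as a validated int."""
    if not cfg.get('skip_header'):
        return 0
    # Safeguard: fall back to a default when the key is missing
    raw = cfg.get('header', default)
    # Safeguard: int() raises ValueError if the value cannot be converted
    count = int(raw)
    if count < 0:
        raise ValueError(f'header must be non-negative, got {count}')
    return count


lines = ['# title\n', '# units\n', '1,2\n', '3,4\n']
body = lines[header_rows(config):]
```

Resolving the value once in a helper keeps the slicing code free of type checks. A definitions file, the third safeguard listed above, would move this validation to config load time instead.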