Batch Process Drawings
Imagine that you are processing hundreds of engineering drawings from an old archive and would like to handle them all in one run. Werk24 allows you to process multiple files from a single script; all you need to do is follow the steps in this tutorial.
What will I learn?
- Collectively processing multiple technical drawings.
- Avoiding loss of data due to identical file names. For example, file_name.pdf and file_name.png could be two different files. This tutorial shows how to process multiple files with the same name without overwriting earlier results.
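To see why identical names are a problem, here is a minimal sketch that uses only the standard library. The two file names are the ones from the example above; the naive extension swap is an assumption made for illustration and calls nothing in Werk24.

```python
from pathlib import Path

# Two different drawings that share the same stem
inputs = [Path("file_name.pdf"), Path("file_name.png")]

# A naive mapping that only swaps the extension for .json
naive_outputs = [p.with_suffix(".json").name for p in inputs]

# Both inputs map to the same output name, so the second
# result would overwrite the first
collision = len(set(naive_outputs)) < len(naive_outputs)
print(naive_outputs, collision)
```

Step 6 of this tutorial avoids the collision by appending a counter to the output name.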
Step 1: Import all the requirements
Let's get imports out of the way at the very beginning.
import asyncio
import glob
import json
import os
from functools import partial
from pathlib import Path
from werk24 import Hook, W24AskTitleBlock
from werk24.models.techread import W24TechreadMessage
from werk24.utils import w24_read_async
Step 2: Define folder paths
There are many ways of defining the input and output paths. For simplicity, we define them as constants. Please update these paths to match your own setup.
INPUT_FOLDER = Path("/abs/path/to/input/")
OUTOUT_FOLDER = Path("/abs/path/to/output")
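As one alternative to hard-coded constants, the folders could be read from environment variables. This is only a sketch: the variable names `W24_INPUT_FOLDER` and `W24_OUTPUT_FOLDER` and the helper `resolve_folders` are hypothetical and not part of Werk24.

```python
import os
from pathlib import Path
from typing import Mapping, Tuple


def resolve_folders(
    env: Mapping[str, str] = os.environ,
    default_input: str = "/abs/path/to/input/",
    default_output: str = "/abs/path/to/output",
) -> Tuple[Path, Path]:
    """Return (input_folder, output_folder), preferring environment variables."""
    # Hypothetical variable names, chosen for this example only
    input_folder = Path(env.get("W24_INPUT_FOLDER", default_input))
    output_folder = Path(env.get("W24_OUTPUT_FOLDER", default_output))
    return input_folder, output_folder
```

Whichever approach you choose, the output folder must exist before results are written. Calling `mkdir(parents=True, exist_ok=True)` on the output path creates it if needed.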
Step 3: Obtain and process all files in the input folder
Our first functional code snippet takes a path as input, locates all files in the given folder and processes them.
async def _process_folder(path: Path) -> None:
    """ Process all files in the provided path

    Args:
        path (Path): Absolute or relative path of the folder
            to process
    """
    # glob.glob expects a string pattern, so convert the Path
    files = [
        f
        for f in glob.glob(str(path / "*"))
        if os.path.isfile(f)
    ]
    for filename in files:
        await _process_file(filename)
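The loop above awaits one file at a time. If you would rather have several files in flight at once, a common pattern is `asyncio.gather` combined with a semaphore that caps concurrency. The sketch below is an assumption about how this could be structured, not part of the tutorial's code: `process_all` and `limit` are hypothetical names, and `fake_worker` stands in for `_process_file`.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def process_all(
    filenames: Iterable[str],
    worker: Callable[[str], Awaitable[None]],
    limit: int = 4,
) -> None:
    """Run `worker` on every filename with at most `limit` running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def _guarded(filename: str) -> None:
        # Wait for a free slot before starting the worker
        async with semaphore:
            await worker(filename)

    await asyncio.gather(*(_guarded(f) for f in filenames))


# Stand-in worker that only records what it was asked to process
processed: List[str] = []


async def fake_worker(filename: str) -> None:
    await asyncio.sleep(0)
    processed.append(filename)


asyncio.run(process_all(["a.pdf", "b.png", "c.tif"], fake_worker, limit=2))
print(sorted(processed))
```

Whether a higher limit is appropriate depends on your own setup, so start with a small value.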
Step 4: Process individual file
Next, we need to define what "process this file" should mean. This function loads the drawing bytes from the given path, defines what information shall be extracted and which function shall be called when the information is available, and then starts the process.
async def _process_file(path: str) -> None:
    """ Process the file located at `path`

    Args:
        path (str): Path of the file
    """
    # load the drawing as raw bytes
    with open(path, 'rb') as fid:
        drawing_bytes = fid.read()

    # ask for the title block and store each response as it arrives
    hooks = [
        Hook(
            ask=W24AskTitleBlock(),
            function=partial(_store_response, input_path=path)
        )
    ]
    await w24_read_async(drawing_bytes, hooks)
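The hook above uses `functools.partial` to pre-bind `input_path`, so that the callback later only needs to receive the message. The following stand-alone sketch shows that mechanism in isolation. The function `store` and the string message are placeholders invented for this example and are not Werk24 objects.

```python
from functools import partial


def store(message: str, input_path: str) -> str:
    """Stand-in for a callback that needs the message and the source path."""
    return f"{input_path}: {message}"


# Bind input_path now; the caller later supplies only the message
callback = partial(store, input_path="drawing.pdf")
result = callback("title block found")
print(result)
```

Because each file gets its own partial, every response can be traced back to the file it came from.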
Step 5: Store the response in a file
The previous step registered the function _store_response, which is called whenever the API returns information. This function can be synchronous or asynchronous; the Werk24 framework will call it accordingly. As this function is IO-heavy, we are opting for an asynchronous function.
async def _store_response(
    message: W24TechreadMessage,
    input_path: str
) -> None:
    """ Store the API response in a file

    Args:
        message (W24TechreadMessage): Message that we received
            from Werk24
        input_path (str): Path of the original file that
            was processed. This will automatically be converted
            to an output filename
    """
    output_filename = _make_output_path(input_path)
    with open(output_filename, "w+") as file:
        # json.dump writes directly into the open file handle
        json.dump(message.payload_dict, file, indent=6)
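How Werk24 decides between calling a synchronous and an asynchronous hook is not shown in this tutorial. The following is only a generic sketch of one way a caller can support both kinds of function: call it, then await the result if it turns out to be awaitable. The names `call_hook`, `sync_hook` and `async_hook` are hypothetical.

```python
import asyncio
import inspect
from typing import Any, Callable, List


async def call_hook(function: Callable[[Any], Any], message: Any) -> Any:
    """Call `function` and await the result if it is awaitable."""
    result = function(message)
    if inspect.isawaitable(result):
        result = await result
    return result


received: List[str] = []


def sync_hook(message: str) -> None:
    received.append(f"sync:{message}")


async def async_hook(message: str) -> None:
    received.append(f"async:{message}")


async def demo() -> None:
    await call_hook(sync_hook, "m1")
    await call_hook(async_hook, "m2")


asyncio.run(demo())
print(received)
```

Checking the returned value rather than the function itself also works when the hook is wrapped in `functools.partial`.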
Step 6: Convert the input filename into output path
The last step converted the input_path into an output_filename by calling _make_output_path. We still need to define this function. To ensure that previously written information is not overwritten, it checks whether the file name already exists and, if so, finds a path that is new.
def _make_output_path(
    input_path: str,
    max_duplicates: int = 100
) -> Path:
    """ Convert an input path into an output path

    Args:
        input_path (str): path of the original file
        max_duplicates (int): maximum number of numbered
            alternatives to try

    Returns:
        Path: output path

    Raises:
        RuntimeError: raised when we could not generate an
            output path
    """
    basepath = os.path.basename(input_path)
    parts = os.path.splitext(basepath)

    # return the simple path if it does not exist yet
    path = OUTOUT_FOLDER / f"{parts[0]}.json"
    if not os.path.exists(path):
        return path

    # otherwise try several times to find a path that
    # does not exist yet
    for i in range(max_duplicates):
        path = OUTOUT_FOLDER / f"{parts[0]}({i+1}).json"
        if not os.path.exists(path):
            return path

    raise RuntimeError(
        f"Could not generate a new output filename for `{input_path}`")
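To illustrate the naming scheme, here is a self-contained sketch that runs against a temporary directory. It follows the same logic as the function above but takes the output folder as a parameter; the name `make_output_path` and that extra parameter are assumptions made for this example.

```python
import tempfile
from pathlib import Path


def make_output_path(
    input_path: str,
    output_folder: Path,
    max_duplicates: int = 100,
) -> Path:
    """Return a .json path in output_folder that does not exist yet."""
    stem = Path(input_path).stem
    candidate = output_folder / f"{stem}.json"
    if not candidate.exists():
        return candidate
    for i in range(1, max_duplicates + 1):
        candidate = output_folder / f"{stem}({i}).json"
        if not candidate.exists():
            return candidate
    raise RuntimeError(
        f"Could not generate a new output filename for `{input_path}`")


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    names = []
    for source in ["file_name.pdf", "file_name.png", "file_name.tif"]:
        target = make_output_path(source, folder)
        target.write_text("{}")  # create the file so the next call sees it
        names.append(target.name)

print(names)
```

Note that the existence check and the later write are separate steps, so this scheme assumes that only one process writes to the output folder at a time.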
Step 7: Call the main function
Done - the last thing that remains is calling the function.
if __name__ == "__main__":
    asyncio.run(_process_folder(INPUT_FOLDER))