You can extract tables from documents by creating a "Table Data" or "Line Items" parsing rule. The short video below gives a quick overview, and a detailed step-by-step guide follows further down the page.
💡 By the end of this guide you’ll be able to:
- Create a table parsing rule
- Define column boundaries
- Filter rows using “Keep Rows Where”
1) Create a New Parsing Rule
Go to the Rules page in your parser and click on Add Parsing Rule or Create First Parsing Rule.
2) Select the Rule Type/Pre-set
Next, you will see a list of parsing rule types, each designed to help you create a specific kind of rule. Click or search for "Table Data", "Line Items", or "Smart Tables".
3) Create Column Dividers
Define where each of your columns is. To do this, click and drag the red lines so that they roughly match your column boundaries.
You can also add a column divider by clicking the red bars on either side of the screen, or remove one by clicking the trash icon at the top or bottom of the divider.
Once you are happy, click "Confirm" to proceed!
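To make the idea of column dividers concrete, here is a minimal Python sketch of cutting lines of text into cells at fixed horizontal positions. It is not Docparser's implementation: the function name `split_by_dividers`, the sample rows, and the use of character offsets in place of on-screen divider positions are all assumptions made for illustration.

```python
def split_by_dividers(line, dividers):
    """Cut one line of text into cells at the given character offsets.

    Hypothetical helper: `dividers` stands in for the red divider lines,
    expressed here as character positions rather than screen coordinates.
    """
    edges = [0] + sorted(dividers) + [None]  # None means "to end of line"
    return [line[start:end].strip() for start, end in zip(edges, edges[1:])]


# Build sample rows with known column widths (12, 6, 9 and 9 characters).
items = [("Widget A", "2", "10.00", "20.00"), ("Widget B", "10", "1.50", "15.00")]
lines = [f"{d:<12}{q:>6}{p:>9}{t:>9}" for d, q, p, t in items]

# Dividers sit where one column ends and the next begins.
dividers = [12, 18, 27]
table = [split_by_dividers(line, dividers) for line in lines]
for row in table:
    print(row)
```

Because each cell is stripped of surrounding whitespace, a divider only needs to fall somewhere in the gap between two columns, which is why a rough match is usually enough.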
4) Refine - Using Filters
On this page you will see the raw output of your document. The next step is to refine that output so that only the rows you need remain.
With many types of documents, you can do this by finding a column whose values follow a consistent pattern (for example, amounts or decimals) and keeping only the rows that match it.
In the example below, column 4 is selected and specified as an amount, so the system returns only the rows where column 4 contains an amount.
Your parsing rule will now extract only the rows that match this condition. Each time it runs on a document, it searches for this data and returns only what you need!
Try selecting different data types and columns to match your data.
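The row filter described above can be pictured as a pattern test applied to one column. The following Python sketch is only an illustration under stated assumptions: `keep_rows_where`, the `AMOUNT` pattern (an optional sign and currency symbol, digits with optional thousands separators, and exactly two decimals), and the sample rows are all hypothetical and do not reflect how Docparser defines an amount internally.

```python
import re

# Assumed definition of an "amount" for this sketch only.
AMOUNT = re.compile(r"-?[$€£]?\d[\d,]*\.\d{2}")


def keep_rows_where(rows, column, pattern=AMOUNT):
    """Return only the rows whose cell at `column` (0-based) fully matches `pattern`."""
    return [
        row for row in rows
        if column < len(row) and pattern.fullmatch(row[column])
    ]


rows = [
    ["Description", "Qty", "Unit Price", "Total"],   # header row
    ["Widget A", "2", "10.00", "20.00"],
    ["Thank you for your business", "", "", ""],     # footer noise
    ["Widget B", "10", "101.50", "1,015.00"],
]

# Column 4 in the guide corresponds to index 3 here.
filtered = keep_rows_where(rows, 3)
for row in filtered:
    print(row)
```

In this sketch the header and footer rows drop out because their fourth cell is not an amount, leaving only the two line items.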
Docparser comes with hundreds of other filters that help refine both simple and complex documents. Check out the following articles to learn more:
- Group and Merge Table Rows
- Identify Table Subsections
- How to parse tables with complex layouts?
- How to extract floating tables which do not have a fixed position?