Extract the original date field
Parsing a date field from your PDF documents is easy with Docparser. All you need to do is create a new parsing rule and select 'Date Field' in the first step. In the second step, you can use our visual selection tool to roughly narrow down the area where you expect the date field to be.
Apply further formatting to the extracted date field
Finally, the third step is all about formatting your date so that the parsed data looks the way you need it. Parsed dates are formatted by providing a pattern string which describes the desired output format.
Date patterns are described with the same syntax used by the PHP date function. For example, if you want to turn "Saturday, 29th August 2015" into "2015-08-29", you can set "Y-m-d" as the output pattern. Below is a list of all available date formatting patterns.
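To make the idea of an output pattern more concrete, here is a minimal Python sketch of how a PHP-style pattern such as "Y-m-d" could be applied to an already parsed date. This is not Docparser's implementation: the `PHP_TO_STRFTIME` mapping and the `format_date` helper are hypothetical names introduced for illustration, and only a small subset of the pattern characters is covered.

```python
from datetime import datetime

# Hypothetical mapping from a few PHP-style pattern characters to Python
# strftime codes. Only characters with a direct zero-padded equivalent are
# included; names such as %A and %B follow the active locale.
PHP_TO_STRFTIME = {
    "d": "%d",  # day of the month, 01 to 31
    "D": "%a",  # short weekday name
    "l": "%A",  # full weekday name
    "F": "%B",  # full month name
    "m": "%m",  # month, 01 through 12
    "M": "%b",  # short month name
    "Y": "%Y",  # four-digit year
    "y": "%y",  # two-digit year
    "H": "%H",  # 24-hour format, 00 through 23
    "h": "%I",  # 12-hour format, 01 through 12
    "i": "%M",  # minutes, 00 to 59
    "s": "%S",  # seconds, 00 through 59
    "A": "%p",  # AM or PM
}


def format_date(value: datetime, pattern: str) -> str:
    """Render `value` with a PHP-style pattern (illustrative subset only)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append("%%")  # keep a literal percent sign literal
        else:
            parts.append(PHP_TO_STRFTIME.get(ch, ch))
    return value.strftime("".join(parts))


parsed = datetime(2015, 8, 29)
print(format_date(parsed, "Y-m-d"))  # 2015-08-29
print(format_date(parsed, "d/m/Y"))  # 29/08/2015
```

Characters that are not in the mapping, such as "-" and "/", pass through unchanged, which is why separators can be written directly into the pattern.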
Day Formatting Syntax
d | Day of the month, 2 digits with leading zeros | 01 to 31 |
D | A textual representation of a day, three letters | Mon through Sun |
j | Day of the month without leading zeros | 1 to 31 |
l (lowercase 'L') | A full textual representation of the day of the week | Sunday through Saturday |
S | English ordinal suffix for the day of the month, 2 characters | st, nd, rd or th. Works well with j |
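The rule behind the S character can be sketched in a few lines of Python. The `ordinal_suffix` function below is a hypothetical helper written for illustration, not part of Docparser; it shows why 11, 12 and 13 take "th" even though they end in 1, 2 and 3.

```python
def ordinal_suffix(day: int) -> str:
    """Return the English ordinal suffix (st, nd, rd or th) for a day number."""
    # 11, 12 and 13 are exceptions: they always take "th".
    if 11 <= day % 100 <= 13:
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")


# Combining the unpadded day (j) with its suffix (S), as in the pattern "jS".
for day in (1, 2, 3, 11, 22, 29):
    print(f"{day}{ordinal_suffix(day)}")
# 1st, 2nd, 3rd, 11th, 22nd, 29th (one per line)
```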
Week Formatting Syntax
W | ISO-8601 week number of year, weeks starting on Monday | Example: 42 (the 42nd week in the year) |
Month Formatting Syntax
F | A full textual representation of a month, such as January or March | January through December |
m | Numeric representation of a month, with leading zeros | 01 through 12 |
M | A short textual representation of a month, three letters | Jan through Dec |
n | Numeric representation of a month, without leading zeros | 1 through 12 |
Year Formatting Syntax
Y | A full numeric representation of a year, 4 digits | Examples: 1999 or 2003 |
y | A two digit representation of a year | Examples: 99 or 03 |
Time Formatting Syntax
a | Lowercase Ante meridiem and Post meridiem | am or pm |
A | Uppercase Ante meridiem and Post meridiem | AM or PM |
B | Swatch Internet time | 000 through 999 |
g | 12-hour format of an hour without leading zeros | 1 through 12 |
G | 24-hour format of an hour without leading zeros | 0 through 23 |
h | 12-hour format of an hour with leading zeros | 01 through 12 |
H | 24-hour format of an hour with leading zeros | 00 through 23 |
i | Minutes with leading zeros | 00 to 59 |
s | Seconds, with leading zeros | 00 through 59 |
Other Formatting Options
timestamp | Returns the Unix timestamp (seconds since 1 January 1970, 00:00:00 UTC) | 1449237286 |
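If you consume the timestamp value downstream, it can be turned back into a calendar date. The Python sketch below decodes the example value from the table; it assumes the timestamp is interpreted in UTC, which is how such timestamps are defined, but the time zone that Docparser applies when parsing your documents is not covered here.

```python
from datetime import datetime, timezone

# Example value from the table above, interpreted as seconds since
# 1 January 1970, 00:00:00 UTC.
moment = datetime.fromtimestamp(1449237286, tz=timezone.utc)

print(moment.isoformat())        # 2015-12-04T13:54:46+00:00
print(int(moment.timestamp()))   # 1449237286
```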
How to Format a Date: