PANDAS I
How to set Advanced parameters.
Please also refer to:
- https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html
- https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
Specify file-path of input and output files. If you want give Title to the HTML report, put the Title here. If you wan to analyze a certain range from the data file specify here.
When reading Excel file, specify sheet name.
- Defaults to 0: 1st sheet as a DataFrame
- 1: 2nd sheet as a DataFrame
- "Sheet1": Load sheet with name “Sheet1”
- [0, 1, "Sheet5"]: Load first, second and sheet named “Sheet5” as a dict of DataFrame
- None: All sheets.
3. Specify what row you have the headers: Row (0-indexed) to use for the column labels of the parsed DataFrame. If a list of integers is passed those row positions will be combined into a MultiIndex. Use None if there is no header.
4. Column (0-indexed) to use as the row labels of the DataFrame. Pass None if there is no such column. If a list is passed, those columns will be combined into a MultiIndex. If a subset of data is selected with usecols, index_col is based on the subset.
5. If None, then parse all columns.
- If str, then indicates comma separated list of Excel column letters and column ranges (e.g. “A:E” or “A,C,E:F”). Ranges are inclusive of both sides.
- If list of int, then indicates list of column numbers to be parsed.
- If list of string, then indicates list of column names to be parsed.
6. Data type for data or columns. E.g. {‘a’: np.float64, ‘b’: np.int32} Use object to preserve data as stored in Excel and not interpret dtype. If converters are specified, they will be applied INSTEAD of dtype conversion.
7. Delimiter to use. If sep is None, the C engine cannot automatically detect the separator, but the Python parsing engine can, meaning the latter will be used and automatically detect the separator by Python’s builtin sniffer tool, csv.Sniffer. In addition, separators longer than 1 character and different from '\s+' will be interpreted as regular expressions and will also force the use of the Python parsing engine. Note that regex delimiters are prone to ignoring quoted data. Regex example: '\r\t'.
8. Return Value stores the complete file-path of “Out File”.
Functions supported. (Execution sequence must be in the order listed below.)
- Extraction of rows that meet logical criteria (Filtering).
- Replacing of columns.
- Adding new columns.
- Dropping (removing) of columns.
- Selection of rows and columns by position.
Command Examples
1) Extraction of rows that meet logical criteria (Filtering).
For Column "Qty" filter out less then 500 |
df['Qty'] >= 500.0 |
For Column "Item ID" filter only starts with “The”. Function “contains” take the parameter a regular expression. |
df['Item ID'].str.contains('^The') |
regex (Regular Expressions) Examples | |
---|---|
'\.' | Matches strings containing a period '.' |
'Length$' | Matches strings ending with word 'Length' |
'^Sepal' | Matches strings beginning with the word 'Sepal' |
'^x[1-5]$' | Matches strings beginning with 'x' and ending with 1,2,3,4,5 |
'^(?!Species$).*' | Matches strings except the string 'Species' |
For logical AND filter |
(df['Qty'] >= 500.0) & (df['Item ID'].str.contains('^The')) |
Logic in Python (and pandas) | |||
---|---|---|---|
< | Less than | != | Not equal to |
> | Greater than | df.column.isin(values) | Group membership |
== | Equals | pd.isnull(obj) | Is not NaN |
<= | Less than or equals | pd.notnull(obj) | Is not NaN |
>= | Greater than or equals | $, |, ~, ^, df.any(), de.all() | Logical and, or, not, xor, any, all |
2) Replacing of columns.
Replace "A " with "B" for column "Col" |
Col ::= 'A','B' |
3) Adding new columns.
Extract first word from “Col” column and then add it to the new column “NewCol” |
NewCol=lambda x: x['Col'].str.extract(r'^(\w+)') |
4) Dropping (removing) of columns.
Delete one or more columns |
‘Length’, ’Hight’ |
5) Selection of rows and columns by position.
Select between 3rd and 5th rows and 1st and 3rd columns |
2:5, 0:3 |
NOTE
Execution sequence of the functions must be in the order listed above within “one run” of the pandas I plugin. If you require to execute functions out of this order, please use multiple pandas I plugin in back to back. (In a daisy chain way)
How to set parameters – actual examples.
About Jupyter Notebook --- Before building your automation with the pandas I plugin, we strongly recommend you to take your data and test your commands on Jupyter Notebook https://jupyter.org/. You can use the commands from Jupyter Notebook directly at the pandas I plugin.
- Example 01
Jupyter Notebook.pdf
- Example 02
Pandas CheatSheet.pdf
- ABBYY Download
- ABBYY Status
- ABBYY Upload
- AD LDAP
- Adv Send Email
- API Requests
- ARGOS API
- Arithmetic Op
- ASCII Converter
- Attach Image
- AWS S3
- AWS Textra Rekog
- Base64
- Basic Numerical Operations
- Basic String Manipulation
- Bot Collabo
- Box
- Box II
- Chatwork GetMessage
- Chatwork Notification
- Citizen Log
- Clipboard
- Codat API
- Convert CharSet
- Convert Image
- Convert Image II
- Create Newfile
- CSV2XLSX
- Dashboard Api
- DashBord Api
- Data Plot I
- Date OP
- DeepL Free
- Detect CharSet
- Dialog Calendar
- Dialog Error
- Dialog File Selection
- Dialog Forms
- Dialog Info
- Dialog Password
- Dialog Question
- Dialog Text Entry
- Dialog Text Info
- Dialog Warning
- DirectCloud API
- Doc2TXT
- DocDigitizer Get Doc
- DocDigitizer Tracking
- DocDigitizer Upload
- Drag and Drop
- Dropbox
- Dynamic Python
- Email IMAP ReadMon
- Email Read Mon
- Env Check
- Env Var
- Excel2Image
- Excel Advanced
- Excel Advance IV
- Excel AdvII
- Excel AdvIII
- Excel Copy Paste
- Excel Formula
- Excel Large Files
- Excel Macro
- Excel Newfile
- Excel Simple Read
- Excel Simple Write
- Excel Style
- Excel Update
- Fairy Devices mimi AI
- File Conv
- File Downloader
- File Folder Exists
- File Folder Op
- File Status
- Fixed Form Processing
- Floating Form Processing
- Folder Monitor
- Folder Status
- Folder Structure
- FTP Server
- Git HTML Extract
- Google Calendar
- Google Cloud Vision API
- Google Drive
- Google Search API
- Google Sheets
- Google Token
- Google Translate
- Google TTS
- GraphQL API
- Html Extract
- HTML Table
- IBM Speech to Text
- IBM Visual Recognition
- Java UI Automation
- JP Holiday
- JSON Select
- JSON to from CSV
- Lazarus Forms
- Lazarus Grid
- Lazarus Invoices
- Lazarus RikAI
- Lazarus RikAI2
- Lazarus RikAI2 Async
- Lazarus Riky
- Lazarus VKG
- LINE ID Card OCR
- LINE Notify
- LINE Receipt OCR
- Mangdoc AI Docs
- Microsoft Teams
- MongoDB
- MQTT Publisher
- MS Azure Text Analytics
- MS Word Extract
- NAVER OCR
- Newuser-SFDC
- OCI
- OCR PreProcess
- OpenAI API
- Oracle SQL
- Outlook
- Outlook Email
- PANDAS I
- pandas II
- pandas III
- PANDAS profiling
- Parsehub
- Password Generate
- Path Manipulation
- PDF2Doc
- PDF2Table
- PDF2TXT
- PDF Miner
- PDF SplitMerge
- PDF Viewer(Start/Stop)
- PostgreSQL
- PowerShell
- PPTX Template
- Print 2 Image
- Python Selenium
- QR Generate
- QR Read
- RakurakuHanbai API
- Regression
- Rename File
- REST API
- Rossum
- Running GAS
- Scrapy Basic
- Screen Capture
- Screen Recording START
- Screen Recording STOP
- Screen Snipping
- Seaborn Plot
- SharePoint
- Simple Counter
- Simple SFDC
- Slack
- Sort CSV
- Speed Test
- SQL
- SQLite
- SSH Command
- SSH Copy
- String Manipulation
- String Similarity
- Svc Check
- Sys Info
- Telegram
- Tesseract
- Text2PDF
- Text2Word
- Text Read
- Text Write
- Time Diff
- Time Stamp
- Web Extract
- Windows Op
- Windows Screen Lock
- Win UI Control
- Win UI Text
- Word2PDF
- Word2TXT
- Word Editor
- Work Calendar
- XML Extract
- XML Manipulation
- Xtracta Get Doc
- Xtracta Tracking
- Xtracta Upload
- YouTube Operation
- ZipUnzip