Welcome to ReQUIAM_csv’s documentation!¶

Research themes and organization mapping to work with figshare patron management
Overview¶
Constructs a mapping list between research themes (“portals”) and EDS/LDAP organization codes to work with our Figshare patron management software (ReQUIAM). This code generates a CSV file that is used for automation. It imports a Google Sheet that is maintained by the Data Repository Team. The advantages of using Google Sheets are:
Ease of use (no need to format CSV)
Advanced spreadsheet capabilities with MATCH(), and permitting/prohibiting cells for modification
Documentation capabilities via comments and version history management
Ability to grant access to University of Arizona Libraries staff for coordinated maintenance
The code imports the above Google Sheet as a CSV file using pandas and generates a CSV file called data/research_themes.csv.
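As a rough sketch of this import-and-export step (not the package's actual implementation): pandas.read_csv accepts a URL, a path, or a file-like object, so a CSV export of the sheet can be read directly. The helper name sheet_to_csv, the sample rows, and the column values below are illustrative assumptions, and an in-memory string stands in for the sheet URL so the example runs offline.

```python
import io

import pandas as pd

# Stand-in for the CSV text that a Google Sheet export would return.
# The rows and values are made up for illustration only.
SHEET_CSV = """Org Code,Research Themes,Sub-portals
0101,astronomy,
0202,health,nursing
"""


def sheet_to_csv(source, outfile):
    """Read a CSV source (URL, path, or file-like) and write it back out."""
    # dtype=str keeps codes such as "0101" from being parsed as integers
    df = pd.read_csv(source, dtype=str)
    df.to_csv(outfile, index=False)
    return df


out = io.StringIO()
df = sheet_to_csv(io.StringIO(SHEET_CSV), out)
print(out.getvalue())
```

Reading the codes as strings is a deliberate choice in this sketch: leading zeros would otherwise be lost when pandas infers an integer column.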
There are two versions of this file:
Trusted version, master: [raw] [rendered]
Under development, develop: [raw] [rendered]
The workflow describes how version control is conducted with these two branches. In general, after a maintainer implements a change to the Google Sheet, they update the develop branch. Once that update has been reviewed, a pull request is opened to merge the changes into the master branch.
Getting Started¶
These instructions will get the code running on your local or virtual machine.
Requirements¶
You will need Python (>=3.7.9) and the numpy and pandas packages to have a working copy of this software. See the installation instructions below.
Installation Instructions¶
Python and setting up a conda environment¶
First, install a working version of Python (>=3.7.9). We recommend using the Anaconda package installer.
After you have Anaconda installed, create a separate conda environment and activate it:
$ (sudo) conda create -n rsh_themes python=3.7
$ conda activate rsh_themes
Next, clone this repository into a parent folder:
(rsh_themes) $ cd /path/to/parent/folder
(rsh_themes) $ git clone https://github.com/UAL-ODIS/ReQUIAM_csv.git
With the conda environment activated, you can install with the setup.py script:
(rsh_themes) $ cd /path/to/parent/folder/ReQUIAM_csv
(rsh_themes) $ (sudo) python setup.py develop
This will automatically install the required numpy and pandas packages.
You can confirm installation via conda list:
(rsh_themes) $ conda list requiam_csv
You should see that the version is 0.12.0.
Configuration Settings¶
Configuration settings are specified through the default.ini file. These settings include the Google Sheet information and the CSV file names. Do not change the CSV file names, as this will break ReQUIAM.
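For readers unfamiliar with INI files, the sketch below shows how settings of this kind can be read with Python's standard configparser module. The section names, option names, and sheet values are hypothetical, since the actual layout of default.ini is not shown here; only the two CSV paths come from this documentation.

```python
import configparser

# Hypothetical contents of a default.ini-style file. The real section and
# option names used by ReQUIAM_csv are not shown in this documentation.
EXAMPLE_INI = """
[google]
sheet_id = EXAMPLE_SHEET_ID
gid = 0

[csv]
main_file = data/research_themes.csv
dry_run_file = data/dry_run.csv
"""

config = configparser.ConfigParser()
config.read_string(EXAMPLE_INI)

# Options are returned as strings unless a typed getter is used
main_file = config.get("csv", "main_file")
dry_run_file = config.get("csv", "dry_run_file")
gid = config.getint("google", "gid")
print(main_file, dry_run_file, gid)
```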
Testing Installation¶
To test the installation and create a temporary CSV file that does not affect the main CSV file, run the following command, which generates a file called dry_run.csv:
(rsh_themes) $ python requiam_csv/script_run
Execution¶
By default, the script does a “dry run.” To execute the script and overwrite the main CSV file (data/research_themes.csv), include the --execute argument:
(rsh_themes) $ python requiam_csv/script_run --execute
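The dry-run-by-default behavior can be pictured with a small argparse sketch. This is an assumption about how such a flag is typically wired, not the actual contents of script_run; the function name choose_outfile is hypothetical.

```python
import argparse


def choose_outfile(argv):
    """Pick the output CSV path: the dry-run file unless --execute is given."""
    parser = argparse.ArgumentParser(
        description="Write a dry-run CSV unless --execute is given"
    )
    parser.add_argument(
        "--execute",
        action="store_true",
        help="overwrite the main CSV file instead of writing dry_run.csv",
    )
    args = parser.parse_args(argv)
    return "data/research_themes.csv" if args.execute else "data/dry_run.csv"


print(choose_outfile([]))
print(choose_outfile(["--execute"]))
```

Making the destructive path opt-in means a forgotten flag can only ever produce the temporary file.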
Workflow¶
The recommended workflow to commit changes on the main CSV file is as follows:
First, switch to the develop branch: git checkout develop
Conduct a dry run execution
Compare the two CSV files: ‘data/research_themes.csv’ and ‘data/dry_run.csv’
If the changes are what you expect, conduct the full execution
Update the version number in README.md, __init__.py, and setup.py
Perform a git add and git commit for ‘data/research_themes.csv’ and the above files to develop
Create a pull request to merge develop into master
Update your local git repository with git pull --all
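The comparison step in this workflow (the main CSV file against the dry-run file) can be done with any diff tool. As one possible approach, the sketch below uses Python's standard difflib on two in-memory CSV texts; the helper name csv_diff and the sample rows are illustrative assumptions.

```python
import difflib


def csv_diff(old_text, new_text):
    """Return unified-diff lines between two CSV texts."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="data/research_themes.csv",
            tofile="data/dry_run.csv",
            lineterm="",
        )
    )


# Made-up file contents standing in for the two CSV files
old = "Org Code,Research Themes\n0101,astronomy\n"
new = "Org Code,Research Themes\n0101,astronomy\n0202,health\n"

diff = csv_diff(old, new)
for line in diff:
    print(line)
```

In practice the two texts would be read from disk, for example with pathlib.Path.read_text(). An empty result means the dry run matches the main file.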
Versioning¶
We use SemVer for versioning. For the versions available, see the tags on this repository.
Authors¶
Chun Ly, Ph.D. (@astrochun) - University of Arizona Libraries, Office of Digital Innovation and Stewardship
See also the list of contributors who participated in this project.
License¶
This project is licensed under the MIT License - see the LICENSE file for details.
API Documentation¶
ReQUIAM_csv package¶
Submodules¶
commons module¶
requiam_csv.commons.no_org_code_index(df)¶
Identify entries without an Org Code. This is based on whether the value is set to NaN
- Parameters
df (DataFrame) – Research Themes dataframe
- Return type
ndarray
- Returns
Array containing elements
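A minimal sketch of the NaN-based check described above, assuming the column is named ‘Org Code’ and that positional row indices are what is wanted. The function name no_org_code_index_sketch and the sample data are hypothetical; the package's own implementation may differ.

```python
import numpy as np
import pandas as pd


def no_org_code_index_sketch(df):
    """Return positional indices of rows whose 'Org Code' value is NaN."""
    mask = df["Org Code"].isna().to_numpy()
    return np.where(mask)[0]


# Made-up table with two rows lacking an Org Code
df = pd.DataFrame(
    {
        "Org Code": ["0101", np.nan, "0202", np.nan],
        "Research Themes": ["astronomy", "health", "health", "space"],
    }
)

idx = no_org_code_index_sketch(df)
print(idx)
```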
create_csv module¶
requiam_csv.create_csv.create_csv(url, outfile, log)¶
Generates a list of organization codes and associated portals for figshare account management.
The initial spreadsheet, which is curated by UA Libraries, is provided through the [url] input.
The exported CSV file will be placed in this git repo. Current path and file preference:
requiam_csv/data/research_themes.csv
- Parameters
url (str) – Full url to CSV
outfile (str) – Exported file in CSV format
log (Logger) – Logger object
inspect_csv module¶
requiam_csv.inspect_csv.inspect_csv(df, log)¶
Inspects Google Sheet CSV-export table to identify issues. Minor issues are logged. Major issues prevent creating the final CSV file.
- Minor issues include:
Entries without an ‘Org Code’ (i.e., empty rows). Minor because they are excluded from the final export
- Major issues include:
Duplicate entries based on Org Code
Invalid/incorrect entries in ‘Departments/Colleges/Labs/Centers’. This results in not getting a proper Org Code
Missing ‘Research Themes’ or Sub-portals if either one is provided
- Parameters
df (DataFrame) – Research Themes dataframe
log (Logger) – Logger object
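To illustrate the minor/major distinction, the sketch below checks only two of the listed conditions: missing Org Codes (minor) and duplicate Org Codes (major). It is a simplified assumption of how such checks might look with pandas, not the package's inspect_csv; the function name find_issues, the message wording, and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd


def find_issues(df):
    """Return (minor, major) lists of issue messages for a themes table."""
    minor, major = [], []

    # Minor: rows with no Org Code, which would be dropped on export
    n_missing = int(df["Org Code"].isna().sum())
    if n_missing:
        minor.append(f"Entries without an Org Code: {n_missing}")

    # Major: the same Org Code appearing on more than one row
    codes = df["Org Code"].dropna()
    dup_codes = sorted(codes[codes.duplicated()].unique())
    if dup_codes:
        major.append("Duplicate Org Code: " + ", ".join(dup_codes))

    return minor, major


# Made-up table with one duplicate code and one missing code
df = pd.DataFrame({"Org Code": ["0101", "0202", "0101", np.nan]})
minor, major = find_issues(df)
print(minor)
print(major)
```

A caller following the behavior described above would log the minor messages and stop before writing the CSV file whenever the major list is non-empty.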
logger module¶
class requiam_csv.logger.LogClass(log_dir, logfile)¶
Bases: object
Main class to log information to stdout and ASCII logfile
- To use:
log = LogClass(log_dir, logfile).get_logger()
- Parameters
log_dir (str) – Relative path for exported logfile directory
logfile (str) – Filename for exported log file
get_logger()¶
- Return type
Logger
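Logging to both stdout and a file, as LogClass is described as doing, can be sketched with the standard logging module. The function make_logger, the logger name, and the message format below are assumptions for illustration and do not reproduce LogClass itself.

```python
import logging
import os
import sys
import tempfile


def make_logger(log_dir, logfile):
    """Build a logger that writes to stdout and to log_dir/logfile."""
    os.makedirs(log_dir, exist_ok=True)
    logger = logging.getLogger("requiam_csv_sketch")
    logger.setLevel(logging.INFO)

    # Attach handlers only once so repeated calls do not duplicate output
    if not logger.handlers:
        formatter = logging.Formatter("%(asctime)s %(levelname)s: %(message)s")
        handlers = (
            logging.StreamHandler(sys.stdout),
            logging.FileHandler(os.path.join(log_dir, logfile)),
        )
        for handler in handlers:
            handler.setFormatter(formatter)
            logger.addHandler(handler)
    return logger


# A temporary directory stands in for the real log directory
log_dir = tempfile.mkdtemp()
log = make_logger(log_dir, "example.log")
log.info("dry run started")
```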