Info for scholars

Welcome to Patentopia!  Patentopia extracts patents from the PatentsView database given one of two sets of inputs:  1) Inventor; or 2) Assignee name.  A time window can be selected in either the forward or backward direction; if forward, this allows the patents to be an outcome after a given date; backward allows you to use the patents as a control, predictor, or instrumental variable prior to a given date.  This is designed to be run on an observation level – for instance, a survey response date, company launch date, SBIR award date, or another reference. If no dates are listed in your dataset, Patentopia searches all available patents for all time.

Patentopia is designed to study US innovation and thus “all” patents refers only to US patents.

Options

You will have the option to select:

Inventor or Assignee name (you must pick one)

Forward or backward search – are you looking for a fixed time starting with a given date, or for a window prior? (you must pick one)

Time window, defined in years and months, with respect to a given date (e.g., a survey response date, award or grant date, or another observation-level date of interest).  Entering 0 years/0 months on the input screen causes Patentopia to look for patents for all time.  Similarly, if you do not enter a reference date in the input file, Patentopia will look for patents for all time.

If you select a time window, you may also choose whether to extract the global citations.  This option dramatically extends the processing time.
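The forward/backward window logic described above can be sketched as follows. This is only an illustration of the arithmetic, not Patentopia's own code: the function names, the "forward"/"backward" strings, and the choice to include both endpoints are assumptions made for the example.

```python
import calendar
from datetime import date


def shift_months(d, months):
    """Shift a date by a (possibly negative) number of months.

    The day is clamped to the last day of the target month,
    so 31 March plus 11 months becomes 28 (or 29) February.
    """
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def search_window(reference, years, months, direction):
    """Return (start, end) for the search, or None for an all-time search.

    Assumption: a missing reference date or a 0 years/0 months window
    both mean "search all time", as described in the options above.
    """
    span = years * 12 + months
    if reference is None or span == 0:
        return None
    if direction == "forward":
        return reference, shift_months(reference, span)
    if direction == "backward":
        return shift_months(reference, -span), reference
    raise ValueError("direction must be 'forward' or 'backward'")
```

For example, a backward search of 2 years from an award date of 2015-03-31 would cover 2013-03-31 through 2015-03-31 under these assumptions.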

File format

All input files must be comma-separated value (CSV) files.  Excel sheets should be saved as CSV files.  If you are working in Excel with your own data, please see these instructions for saving your CSV file on a Mac or saving your CSV file on a Windows machine.

The column names must be exactly the following, and these column names are case sensitive!

id (optional) – an identification key or record number from your data set, if desired for reintegration into your data set

assignee (required for search by assignee, optional otherwise) – typically a firm but could be another entity (for instance, a university). Patentopia can also be used to find companies that have been assigned patents by searching for all patents in which the assignee name (USPTO field code: AN) overlaps.  It is not necessary to include incorporation flags such as LLC, Corp., Inc., etc., but including them does not affect the search.  Note that if the company holds patents through a separate agreement (for instance, through a license from a university), Patentopia will not find them because the university is still the assignee.

inventor (required for search by inventor, optional otherwise) – Patentopia was originally constructed to identify inventors serving as SBIR Principal Investigators (PIs).  The name column should include all elements of the name (i.e., first, last; or first, middle, last) in a single cell.  Patentopia will search for all patents in which the test name has a degree of overlap with the inventor name (USPTO field code: IN) and will generate a list of patents that match the last name exactly and contain the first name. A fuzzy matching number specifying the degree of overlap between the full name and the target is estimated ONLY if a full name is provided.

fullName (optional) – useful for projects in which the inventor’s name may be recorded inconsistently in different data sets – for instance, a federal award may have “Bobbie Smith” while the USPTO records may say “Roberta Smith.”  The full name field is an opportunity to identify another potential matching variable (in this case, you might record “Roberta” in the full name field). When this field is present, it is the name on which matching is conducted. A fuzzy matching number specifying the degree of overlap between the name from the sample data and the target is estimated if this column is populated (if not, please see inventor name above).

Important note:  If the inventor includes a middle name, the search will FAIL.  We strongly suggest processing the middle name out of the inventor name and recording a new fullName.

location (optional) – must be a two-letter abbreviation of one of the United States.

date (optional) – used as the defining date for forward or backward searches. Should be stated in yyyy-mm-dd format. Typically this is a survey response date, an award date, or other observation-level time stamp.
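If you prepare your input file with a script rather than in Excel, the column rules above can be sketched as follows. This is a minimal example, not an official Patentopia tool: the helper names, the assumed source date format (mm/dd/yyyy), and the simple "keep the first and last token" rule for removing middle names are all assumptions for illustration. What you record in fullName is left to you.

```python
import csv
from datetime import datetime

# Column names are case sensitive and must match exactly.
COLUMNS = ["id", "assignee", "inventor", "fullName", "location", "date"]


def drop_middle_names(name):
    """Keep only the first and last tokens of a 'first [middle ...] last' name.

    Caution: this simple rule mishandles suffixes ("Jr.") and
    multi-word surnames; review such names by hand.
    """
    parts = name.split()
    if len(parts) <= 2:
        return " ".join(parts)
    return f"{parts[0]} {parts[-1]}"


def to_iso_date(text, source_format="%m/%d/%Y"):
    """Convert a date string to yyyy-mm-dd; an empty value stays empty."""
    text = text.strip()
    if not text:
        return ""
    return datetime.strptime(text, source_format).strftime("%Y-%m-%d")


def build_rows(records):
    """Map raw record dictionaries onto the expected input columns."""
    rows = []
    for rec in records:
        rows.append({
            "id": rec.get("id", ""),
            "assignee": rec.get("assignee", ""),
            "inventor": drop_middle_names(rec.get("inventor", "")),
            "fullName": rec.get("fullName", ""),
            "location": rec.get("location", "").strip().upper(),
            "date": to_iso_date(rec.get("date", "")),
        })
    return rows


def write_input_csv(rows, handle):
    """Write the rows with the exact header to an open text file."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
```

To use it, open the destination with open("input.csv", "w", newline="") and pass the handle to write_input_csv.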

Challenges

Many challenges exist in searching the PatentsView database. Name matches are challenging as names may be misspelled, or many matches are found for common names.  Geographical matches (i.e., by state) are challenging because people generally do not live at work! – for instance, someone may live in Washington, DC and work in Maryland.

A more significant challenge is presented by the clustering algorithm of PatentsView, which may aggregate patents inaccurately.  Please see our technical note for details of how we reverse-engineer the clustering to confirm matches.  We frequently update the rawassignee and assignee data tables from PatentsView (see information here) and you will see the date of our last update when you run Patentopia.

Therefore, our process includes matching measures to estimate the overlap on two fronts: 1) between inventor names; and 2) between US state names.  These measures are just a guide (but faster than using the USPTO search function on a case-by-case basis!).
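To give a feel for what a name-overlap measure looks like, here is a small sketch. It is not Patentopia's actual matching algorithm, which is described in the technical note; the function names are hypothetical, the candidate rule is one reading of "exact match of the last name, and containing the first name", and the score uses Python's standard difflib ratio as a stand-in for the fuzzy matching number.

```python
from difflib import SequenceMatcher


def is_candidate(query, inventor_name):
    """True if the last names match exactly and the query's first name
    appears within the candidate's given names (case-insensitive)."""
    query_tokens = query.lower().split()
    name_tokens = inventor_name.lower().split()
    if len(query_tokens) < 2 or len(name_tokens) < 2:
        return False
    query_first, query_last = query_tokens[0], query_tokens[-1]
    given_names, last = " ".join(name_tokens[:-1]), name_tokens[-1]
    return last == query_last and query_first in given_names


def overlap_score(a, b):
    """Similarity between two names, from 0.0 (no overlap) to 1.0 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()
```

Under this sketch, "Roberta Smith" is a candidate for "Roberta A. Smith" but not for "Roberta Smyth", and a pair such as "Bobbie Smith" and "Roberta Smith" receives a score strictly between 0 and 1.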

Send comments and suggestions here.

Outputs

Please review the complete output code books for assignee search and inventor search

Please look in your junk mail if you don’t receive the output files. Emails are sent from sender “Patentopia team”, with the subject “Patentopia Data Results yyyy-MM-dd” indicating the time stamp in Pacific Time.  All output files are labeled with your username and the time stamp.

The three zipped files are:

Complete output – lists all relevant patents individually and matching measures: Each row represents one patent for the assignee or inventor.

Abridged output – aggregates patents based on the matches: Each row represents all patents for that assignee or inventor.

Timeout – If any inventor or assignee calls to PatentsView fail, we attach a single zip of a CSV file listing the failed results. The zip file is named Failed_inventor_results-yyyy-MM-dd for an inventor request and Failed_assignee_results-yyyy-MM-dd for an assignee request.

These files are labeled with your username and the time so you can identify the specific run.

The data fetch typically takes a few minutes for a file of ~100-500 records, but as this is hosted on AWS, the system may take even longer before sending you an email.  Typically the assignee search is faster than the PI search because of fewer disambiguation issues.

If the output exceeds 10,000 records in any of the output files, Patentopia will divide the output into multiple zip files and send them to you individually for aggregation.
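If you receive several zip files, you can aggregate them with a short script along these lines. This is a sketch under assumptions: it supposes that each zip contains one or more CSV files sharing the same header row and encoded as UTF-8, and the function name is made up for the example.

```python
import csv
import io
import zipfile


def merge_zipped_csvs(zip_sources, out_handle):
    """Concatenate every CSV inside the given zip files into one CSV.

    zip_sources: paths or binary file objects, in the order to merge.
    out_handle: an open text file to write the combined CSV to.
    Returns the number of data rows written (header excluded).
    """
    writer = csv.writer(out_handle)
    header = None
    total = 0
    for source in zip_sources:
        with zipfile.ZipFile(source) as archive:
            for member in sorted(archive.namelist()):
                if not member.lower().endswith(".csv"):
                    continue
                with archive.open(member) as raw:
                    text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                    reader = csv.reader(text)
                    file_header = next(reader, None)
                    if file_header is None:
                        continue  # empty file, nothing to merge
                    if header is None:
                        header = file_header
                        writer.writerow(header)  # write the header once
                    elif file_header != header:
                        raise ValueError(f"{member}: header differs from the first file")
                    for row in reader:
                        writer.writerow(row)
                        total += 1
    return total
```

Open the destination with open("combined.csv", "w", newline="") and pass the zip paths in the order you received them.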

Please send us your feedback and experience with run times to charlesp (at) usc dot edu.

Getting help

Send emails with your questions to charlesp (at) usc dot edu.  You may also email if you do not receive an output within 24 hours.  When communicating, please:

Indicate the registered user ID on Patentopia

Forward any error messages that you receive, as they include the time stamps.

Enjoy!