Loop strmatch, then dataset intersection, then extract data subset by observation




Step 1: Find intersection between two datasets. I would like to find the intersection between biglist and matchlist.

use matchlist, clear //matchlist contains unique observations that I need
levelsof countryname, local(country1)
use biglist, clear //big list has a lot of duplicates and things I don't need
levelsof countryname, local(country2)

local common_country: list country1 & country2
display `common_country'
//This produces one long string of all countries, not comma-delimited

keep if strpos(countryname, `"`common_country'"')
//Therefore this produces an empty dataset
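One possible reading of why this comes back empty: strpos(s1, s2) looks for s2 inside s1, so the line above searches for the entire list of countries inside each single country name, which never matches. A minimal sketch of an alternative is below. It assumes countryname is a string variable, uses toy stand-ins for matchlist and biglist (the values are made up), and introduces a helper variable in_common that is not part of the original data. It loops over the intersected list and flags exact matches instead of substring matches.

```stata
* Toy stand-ins for matchlist and biglist (hypothetical values)
clear
input str20 countryname
"France"
"South Korea"
"Japan"
end
tempfile matchlist
save `matchlist'

clear
input str20 countryname int year
"France" 2000
"France" 2001
"Germany" 2000
"South Korea" 2000
end
tempfile biglist
save `biglist'

* Collect the distinct country names from each dataset
use `matchlist', clear
levelsof countryname, local(country1)
use `biglist', clear
levelsof countryname, local(country2)

* Intersection of the two macro lists
local common_country : list country1 & country2
display `"`common_country'"'

* Flag rows whose countryname equals one of the common values
* (in_common is a hypothetical helper variable)
generate byte in_common = 0
foreach c of local common_country {
    replace in_common = 1 if countryname == `"`c'"'
}
keep if in_common
drop in_common
```

With real data, the two `use` lines would point at matchlist and biglist directly. If countryname is unique within matchlist, `merge m:1 countryname using matchlist, keep(match)` may be a shorter route to the same subset.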

Step 2: Extract one dataset per country

set trace on
fs "*.dta" // fs is a user-written command that lists the files matching a pattern
foreach i in `r(files)' {
    use "`i'"
    levelsof countryname, local(countryname)
    foreach c of local countryname {
        keep if strmatch(countryname, "`c'")
        save "`c'_`i'", replace
        restore, preserve // this keeps failing at random, sometimes with "already preserved"; please review for general consistency
    }
}
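A likely source of the failures is that `restore, preserve` needs a matching `preserve` before it, and the loop above never issues one. A minimal sketch of the loop with a plain preserve/restore pair is below. It assumes every .dta file in the working directory has a string variable countryname, swaps fs for Stata's built-in `dir` extended macro function, and writes to a folder named bycountry, which is a hypothetical name chosen so that the new files are not picked up again on a rerun.

```stata
* Hypothetical output folder, so the new files are not re-read on a rerun
capture mkdir "bycountry"

* Built-in alternative to fs: names of all .dta files in the working directory
local files : dir "." files "*.dta"

foreach f of local files {
    use "`f'", clear
    levelsof countryname, local(countries)
    foreach c of local countries {
        preserve                           // snapshot the full file
        keep if countryname == `"`c'"'     // exact match on one country
        save "bycountry/`c'_`f'", replace
        restore                            // back to the full file
    }
}
```

The exact comparison with `==` replaces strmatch() here because no wildcard is involved; strmatch() would treat any `*` or `?` inside a country name as a pattern character.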

Step 3: Create one dataset per country

I now have a folder of more than 100 small datasets whose file names start with a country name (e.g. "Francexxx_.dta"). I would like to merge all of these small datasets into a single dataset per country. The country names come from matchlist.

fs "*.dta"
foreach i in `r(files)' {
    //placeholder: add loop level "foreach country in countryname of matchlist"
    joinby countryname year using "`i'"
    *cap drop _merge
}
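A minimal sketch of the missing outer loop is below, under several assumptions: the per-country files are named `country_sourcefile.dta`, they sit in a folder hypothetically named bycountry, and stacking the files with `append` is what is wanted. Note that this swaps joinby for append; joinby forms all pairwise combinations within countryname and year, which may not be the intended result.

```stata
* Country names to loop over, taken from matchlist
use matchlist, clear
levelsof countryname, local(countries)

foreach c of local countries {
    * All per-country files for this country (bycountry is a hypothetical folder)
    local files : dir "bycountry" files "`c'_*.dta"

    * Skip countries that have no files
    if `"`files'"' == "" {
        continue
    }

    * Start from an empty dataset and stack each file below the last
    clear
    foreach f of local files {
        append using "bycountry/`f'"
    }
    save "combined_`c'.dta", replace
}
```

If the small files hold different variables for the same country-years, rather than different rows, `merge 1:1 countryname year using ...` inside the inner loop may fit better, assuming countryname and year uniquely identify observations in each file.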
