Randomly shuffle values from a column into a series of new columns in a feature class in ArcGIS for Desktop?
I am trying to run a Monte Carlo simulation on a point feature class. To do so, I want to take the values from a column of observed data and randomly "shuffle" them among the points over n additional "simulated" columns. In my case, I'd like to add 199 new columns, each holding the observed values randomly redistributed among the points.
My observed data are binary (1 and 0).
I have been able to do this in R previously and bring the table back into ArcGIS, but the tables have become so unwieldy that I am having difficulty bringing them back in, and I would like to run the analysis in ArcGIS rather than moving in and out.
Does anyone have any idea how to do this? I have tried using a cursor, but have failed.
You can accomplish this task using the following workflow:
Generate a list of all the values (1's and 0's) in a field:
    vals = [row[0] for row in arcpy.da.SearchCursor(fc, ["binary"])]
Create a loop that runs once for each field you want to create:
    for x in range(2):
Within that loop, randomly shuffle the list of 1's and 0's:
    random.shuffle(vals)
Incorporate the loop index into the field name:
    fieldName = "Field" + str(x + 1)
Start an UpdateCursor on that field:
    with arcpy.da.UpdateCursor(fc, fieldName) as cursor:
Perhaps most important, create a counter and use it to index the shuffled list, then write the indexed value to the attribute table:
    row[0] = int(vals[count])
    import arcpy
    import random

    # Your input feature class
    fc = r'C:\path\to\featureclass'

    # Get a list of all the values (i.e. 1's and 0's) in the field called "binary"
    vals = [row[0] for row in arcpy.da.SearchCursor(fc, ["binary"])]

    # In your case, you probably want to change the range to 199
    for x in range(2):
        random.shuffle(vals)  # Randomly shuffle the list in place
        fieldName = "Field" + str(x + 1)
        arcpy.AddField_management(fc, fieldName, "SHORT")
        count = 0  # position in the shuffled list
        with arcpy.da.UpdateCursor(fc, fieldName) as cursor:
            for row in cursor:
                row[0] = int(vals[count])  # row is a list with one item per field
                count = count + 1
                cursor.updateRow(row)
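The shuffling step can be tried without ArcGIS. Below is a minimal sketch in plain Python, assuming the observed values are already in a list; the function name shuffled_columns and the sample data are hypothetical and only illustrate that every simulated column is a permutation of the observed one, so the number of 1's never changes.

```python
import random

def shuffled_columns(observed, n_columns, seed=None):
    """Return n_columns random permutations of the observed values."""
    rng = random.Random(seed)      # seed is optional, handy for repeatable runs
    columns = []
    for _ in range(n_columns):
        column = list(observed)    # copy so the observed data stay untouched
        rng.shuffle(column)        # shuffle the copy in place
        columns.append(column)
    return columns

observed = [1, 0, 0, 1, 1, 0, 0, 0]   # made-up example data
simulated = shuffled_columns(observed, n_columns=199, seed=42)

print(len(simulated))                                        # 199
print(all(sum(col) == sum(observed) for col in simulated))   # True
```

Each inner list corresponds to one "Field" column in the script above.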
Here's something that I put together; it works in much the same way as Aaron's script.
    import sys
    import random
    import arcpy

    InFC = sys.argv[1]    # input feature class, passed as the first argument
    ThresholdOfOnes = 50  # percent that is!
    Columns = 199
    ColRange = range(1, Columns + 1)

    InCount = int(arcpy.GetCount_management(InFC).getOutput(0))
    ThresholdOfOnes = (InCount * ThresholdOfOnes) / 100  # convert percent into a number

    OIDlist = list()
    desc = arcpy.Describe(InFC)
    OIDfield = desc.OIDFieldName

    # get the OIDs into a list
    with arcpy.da.SearchCursor(InFC, OIDfield) as srch:
        for row in srch:
            OIDlist.append(row[0])

    for ColNum in ColRange:
        FieldName = "Col_%d" % ColNum
        SelOIDlist = list()
        while len(SelOIDlist) < ThresholdOfOnes:
            # pick a random index; randint includes both end points
            PosIndex = random.randint(0, len(OIDlist) - 1)
            if OIDlist[PosIndex] not in SelOIDlist:  # if the OID isn't already in the list
                SelOIDlist.append(OIDlist[PosIndex])

        # now to select with the oidlist
        # start building a definition query
        DefQ = "%s in (" % OIDfield
        isFirst = True
        for ThisOID in SelOIDlist:
            if isFirst:  # the first one doesn't get a comma
                DefQ = DefQ + str(ThisOID)
                isFirst = False
            else:
                DefQ = DefQ + "," + str(ThisOID)
        DefQ = DefQ + ")"  # close the bracket

        # add the field if it doesn't exist
        fList = arcpy.ListFields(InFC, FieldName)
        if not fList:
            arcpy.AddField_management(InFC, FieldName, "SHORT")

        # blank the field
        arcpy.CalculateField_management(InFC, FieldName, "0", "PYTHON")

        # start a cursor with just the random features
        with arcpy.da.UpdateCursor(InFC, FieldName, DefQ) as UpCur:
            for UpRow in UpCur:
                UpRow[0] = 1  # set the value
                UpCur.updateRow(UpRow)  # store it
This will work for shapefiles or feature classes in databases. It gets a list of the OIDs, then randomly picks from the list (ensuring no duplication) until the threshold is reached, turns the picks into a definition query and updates those features. Along the way it adds the field if it doesn't exist and resets it to 0, so previous values are overwritten and it is safe to run again if you need to re-randomize your feature class. Note that the number of 1's in each column comes from ThresholdOfOnes (50 percent here), not from the observed column.
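The pick-without-duplicates loop and the comma handling can also be written with random.sample and str.join. This is only a sketch in plain Python, not the script above: the helper names pick_oids and build_in_clause, the OID values and the "OBJECTID" field name are all made up for illustration.

```python
import random

def pick_oids(oids, percent, rng=random):
    """Pick percent of the OIDs at random, without duplicates."""
    count = len(oids) * percent // 100
    return rng.sample(oids, count)   # sample never returns the same item twice

def build_in_clause(oid_field, oids):
    """Build a where clause such as 'OBJECTID IN (3,7,9)'."""
    return "%s IN (%s)" % (oid_field, ",".join(str(oid) for oid in oids))

oids = list(range(1, 11))                        # hypothetical OIDs 1..10
chosen = pick_oids(oids, 50, random.Random(0))   # seeded only to make the run repeatable
clause = build_in_clause("OBJECTID", sorted(chosen))
print(clause)
```

The resulting string could be passed as the where clause of arcpy.da.UpdateCursor, as in the script above. An empty selection would give "IN ()", which is not valid SQL, so that case would need to be skipped.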
Some fancy dictionary thing and a one-pass update cursor would be the fastest/best, but something like this would work too:
    import arcpy, random

    arcpy.env.overwriteOutput = True  # let the "fl" layer be recreated on each pass

    pntFC = r"C:\temp\test.gdb\test_pnts"
    boolField = "ORIG_TRUTH"  # original bool field
    oidField = arcpy.Describe(pntFC).OIDFieldName  # actual name of the OID field

    falseOidList = [r[0] for r in arcpy.da.SearchCursor(pntFC, ["OID@"], boolField + " = 0")]
    trueOidList = [r[0] for r in arcpy.da.SearchCursor(pntFC, ["OID@"], boolField + " = 1")]
    falseCount = len(falseOidList)
    trueCount = len(trueOidList)

    for i in range(1, 199 + 1):
        randomFieldName = "RND_" + str(i)
        arcpy.AddField_management(pntFC, randomFieldName, "SHORT")
        # pick as many random OIDs as there are observed 1's
        randomTrueOids = random.sample(falseOidList + trueOidList, trueCount)
        # set the picked features to 1
        arcpy.MakeFeatureLayer_management(pntFC, "fl", oidField + " in (" + ",".join(str(oid) for oid in randomTrueOids) + ")")
        arcpy.CalculateField_management("fl", randomFieldName, "1", "PYTHON")
        # everything still NULL becomes 0
        arcpy.MakeFeatureLayer_management(pntFC, "fl", randomFieldName + " IS NULL")
        arcpy.CalculateField_management("fl", randomFieldName, "0", "PYTHON")
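For what the "dictionary thing and a one-pass update cursor" might look like, here is a rough sketch, not a tested ArcGIS workflow. It assumes the OIDs and observed values have already been read into two lists in the same order and that the new fields already exist; build_shuffle_table and write_table are hypothetical names. Only write_table needs arcpy, so the rest runs in plain Python.

```python
import random

def build_shuffle_table(oids, observed, n_columns, seed=None):
    """Map each OID to a list of n_columns shuffled values."""
    rng = random.Random(seed)
    table = {oid: [] for oid in oids}
    values = list(observed)            # work on a copy of the observed data
    for _ in range(n_columns):
        rng.shuffle(values)            # one permutation per simulated column
        for oid, value in zip(oids, values):
            table[oid].append(value)
    return table

def write_table(fc, table, field_names):
    """Write every simulated column in a single cursor pass (requires arcpy)."""
    import arcpy  # imported here so the function above works without ArcGIS
    with arcpy.da.UpdateCursor(fc, ["OID@"] + field_names) as cursor:
        for row in cursor:
            row[1:] = table[row[0]]    # row[0] is the OID, the rest are the new fields
            cursor.updateRow(row)

oids = [1, 2, 3, 4, 5, 6]              # made-up example data
observed = [1, 1, 0, 0, 0, 0]
table = build_shuffle_table(oids, observed, n_columns=3, seed=1)
print(table)
```

With 199 columns, field_names would be a list of 199 existing field names in the same order as the values stored for each OID.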