python - How to append data to an existing LMDB?
I have around 1 million images to put in this dataset, 10000 at a time, appended to the set.
I'm sure the map_size is wrong, with reference to this article.
I used this line to create the set:
import lmdb
env = lmdb.open(path+'mylmdb', map_size=int(1e12))
I use this line every 10000 samples to write data to the file, where x and y are placeholders for the data to be put in the LMDB.
env = create(env, x[:counter,:,:,:], y, counter)

where create is defined as:

import caffe

def create(env, x, y, n):
    with env.begin(write=True) as txn:
        # txn is a Transaction object
        for i in range(n):
            datum = caffe.proto.caffe_pb2.Datum()
            datum.channels = x.shape[1]
            datum.height = x.shape[2]
            datum.width = x.shape[3]
            datum.data = x[i].tobytes()  # or .tostring() if numpy < 1.9
            datum.label = int(y[i])
            str_id = '{:08}'.format(i)
            # The encode is essential in Python 3
            txn.put(str_id.encode('ascii'), datum.SerializeToString())
            # pdb.set_trace()
    return env
How can I edit my code so that new data is added to this LMDB rather than replacing what is there? The present method overwrites the entries in the same positions. I have checked the length after generation with env.stat().
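Let me expand on my comment above.
All entries in LMDB are stored according to unique keys, and your database already contains keys for i = 0, 1, 2, ... . Writing another batch with the same values of i reuses those keys, so the old entries are overwritten. You need a way to find unique keys for each i. The simplest way to do that is to find the largest key in the existing DB and keep adding to it.
Assuming that the existing keys are consecutive,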
max_key = env.stat()["entries"]
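Otherwise, a more thorough approach is to iterate over all the keys. (Check this.) Cursors are opened on a transaction, and the keys come back as byte strings, so they are converted to integers before comparing:
max_key = 0
with env.begin() as txn:
    for key, value in txn.cursor():
        max_key = max(max_key, int(key))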
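The key lookup depends on how the keys are ordered, which the following minimal sketch tries to illustrate. It assumes every key uses the same zero-padded 8-digit format as in the question; make_key is a hypothetical helper name, not part of lmdb. With fixed-width padding, byte-wise order and numeric order agree, so the largest key is also the last one in sorted order.

```python
def make_key(index, width=8):
    """Build a zero-padded ASCII key (hypothetical helper, mirrors '{:08}'.format(i))."""
    return '{:0{w}}'.format(index, w=width).encode('ascii')


# A few sample indices, already in increasing numeric order.
indices = [0, 9, 10, 123, 99999]
keys = [make_key(i) for i in indices]

# Byte-wise sorting keeps the numeric order because all keys have the same width.
keys_sorted = sorted(keys)

# The largest key converts back to the largest index; int() accepts ASCII digit bytes.
largest_index = int(max(keys))
```

Without the padding this would not hold (b'10' sorts before b'9'), so a lookup that relies on key order is only safe if every writer uses the same width.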
Finally, replace line 7 of the for loop,
str_id = '{:08}'.format(i)
by
str_id = '{:08}'.format(max_key + 1 + i)
to append to the existing database.
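Putting the pieces together, here is one possible sketch of an append helper. It is not the code from the question: append_batch and its records argument are hypothetical names, env is assumed to be an lmdb.Environment, and all existing keys are assumed to be zero-padded 8-digit strings so that the last key in the database is also the largest. It uses only env.begin, txn.cursor, cursor.last, cursor.key and txn.put.

```python
def append_batch(env, records):
    """Write records under keys that continue after the largest existing key.

    env is assumed to be an lmdb.Environment. records is a list of bytes,
    for example the results of datum.SerializeToString().
    Returns the index used for the first new key.
    """
    with env.begin(write=True) as txn:
        cursor = txn.cursor()
        # cursor.last() returns False when the database is empty.
        if cursor.last():
            # Assumption: keys are fixed-width digits, so the last key is the largest.
            start = int(cursor.key()) + 1
        else:
            start = 0
        for offset, payload in enumerate(records):
            key = '{:08}'.format(start + offset).encode('ascii')
            txn.put(key, payload)
    return start
```

Called once per batch of 10000 serialized datums, each call would start numbering where the previous one stopped, so earlier entries are not overwritten. Finding the start index inside the same write transaction also avoids depending on env.stat() being up to date.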