Eela6
May 25, 2007
Shredded Hen

JVNO posted:

Wow, great responses and super quick. Unfortunately the responses aren't easily applied to my own program, and I decided instead to rebuild the program in a way that obviated the need for removal.

For anyone curious, I needed a list generated that includes 20 of each of the following:

NR
L0
L2P
L2T
L4P
L4T
L8P
L8T

For a total of 160 items in the list. All of these stand for different experimental conditions, and are randomly presented, but some conditions are related. The rules are:

L0 and NR can go anywhere in the list that doesn’t conflict with another rule.
For n = L2P, n + 1 = L2T
For n = L4P, n + 2 = L4T
For n = L8P, n + 4 = L8T

I’m phone posting now, but my new approach is to add the L(X)P items to the list at the start, shuffle the order, and use that as a seed for a pseudo-random procedural generator. The procedural generator will then populate the list with L(X)T items, using L0/NR items as filler when necessary.

It’s a heck of a lot more complicated than I thought ought to be necessary (~150 lines of code), and is slower than my usual experiment list generator, but I’m ironing out a final couple bugs (usually missing L(X)P items) and it appears to work.

This is an interesting problem that's more difficult than it appears. I spent a little bit of time fiddling with it and wasn't able to find a solution that preserved 'true' randomness (i.e., all valid strings are equally likely given the limits of the PRNG) that wasn't O(n²) or worse.

If it's not sensitive, would you mind showing me what you end up with?

Seventh Arrow
Jan 26, 2005

I'm trying to pass a json file to an Amazon DynamoDB table and it's turning up its nose at it. To wit:

code:
Traceback (most recent call last):
  File "lo_populate.py", line 13, in <module>
    collector_key = int(data['collector_key'])
TypeError: int() argument must be a string, a bytes-like object or a number, not 'dict'
I highly suspect it doesn't like the way that the json file is formatted - a file that I got via the pandas "df.to_json" method. It works ok with the Amazon-provided example. Here's Amazon's code:

code:
from __future__ import print_function # Python 2/3 compatibility
import boto3
import json
import decimal

dynamodb = boto3.resource('dynamodb', region_name='us-west-2', aws_access_key_id='123456789', aws_secret_access_key='987654321')

table = dynamodb.Table('Movies')

with open("moviedata.json") as json_file:
    movies = json.load(json_file, parse_float = decimal.Decimal)
    for movie in movies:
        year = int(movie['year'])
        title = movie['title']
        info = movie['info']

        print("Adding movie:", year, title)

        table.put_item(
           Item={
               'year': year,
               'title': title,
               'info': info,
            }
        )
Here's my code:

code:
from __future__ import print_function # Python 2/3 compatibility
import boto3
import json
import decimal

dynamodb = boto3.resource('dynamodb', region_name='us-west-2', aws_access_key_id='123456789', aws_secret_access_key='987654321')

table = dynamodb.Table('Loyalty_One')

with open("loyalty_one.json") as json_file:
    data = json.load(json_file, parse_float = decimal.Decimal)
    for numbers in data:
        collector_key = int(data['collector_key'])
        sales = int(data['sales'])
        store_location_key = int(data['store_location_key'])

        print("Adding data:", collector_key, sales)

        table.put_item(
           Item={
               'collector_key': collector_key,
               'sales': sales,
               'store_location_key': store_location_key,
            }
        )
Amazon's json is formatted like this:

code:
[
    {
        "year": 2013,
        "title": "Rush",
        "info": {
            "directors": ["Ron Howard"],
            "release_date": "2013-09-02T00:00:00Z",
            "rating": 8.3,
            "genres": [
                "Action",
                "Biography",
                "Drama",
                "Sport"
            ],
            "image_url": "http://ia.media-imdb.com/images/M/MV5BMTQyMDE0MTY0OV5BMl5BanBnXkFtZTcwMjI2OTI0OQ@@._V1_SX400_.jpg",
            "plot": "A re-creation of the merciless 1970s rivalry between Formula One rivals James Hunt and Niki Lauda.",
            "rank": 2,
            "running_time_secs": 7380,
            "actors": [
                "Daniel Bruhl",
                "Chris Hemsworth",
                "Olivia Wilde"
            ]
        }
    }
The pandas-provided json is formatted like this:

code:
{"collector_key":{"0":-1,"1":-1,"2":139517343969,"3":-1,"4":-1,"5":-1,"6":-1,"7":-1,"8":134466064080,"9":-1,"10":-1,"11":-1,"12":-1,"13":-1,"14":-1,"15":-1,"16":-1,"17":-1,"18":-1,"19":-1,"20":-1,"21":-1,"22":-1,"23":-1,"24":-1,"25":-1,"26":-1,"27":-1,"28":-1,"29":-1,"30":134542895920,"31":-1,"32":-1,"33":-1,"34":141237957890,"35":-1,"36":-1,"37":134506730140,"38":-1,"39":-1,"40":-1,"41":-1,"42":-1,"43":-1,"44":-1,"45":-1,"46":139519041189,"47":-1,"48":-1,"49":-1,"50":134515595594,"51":-1,"52":-1,"53":-1,"54":-1,"55":-1,"56":-1,"57":141154004799,"58":-1,"59":-1,"60":-1,"61":-1,"62":-1,"63":-1,"64":-1,"65":-1,"66":-1,"67":139445093521,"68":-1,"69":-1,"70":-1,"71":-1,"72":-1,"73":-1,"74":134461800356,"75":-1,"76":-1,"77":-1,"78":-1,"79":-1,"80":-1,"81":-1,"82":-1,"83":-1,"84":-1,"85":-1,"86":-1,"87":-1,"88":-1,"89":-1,"90":-1,"91":-1,"92":-1,"93":-1,"94":-1,"95":-1,"96":-1,"97":-1,"98":-1,"99":-1,"100":-1,"101":-1,"102":137775494549,"103":-1,"104":139482038795,"105":-1,"106":-1,"107":-1,"108":-1,"109":-1,"110":-1,"111":-1,"112":139481581803,"113":141334108482,"114":-1,"115":139447016485,"116":-1,"117":-1,"118":-1,"119":-1,"120":-1,"121":-1,"122":-1,"123":-1,"124":-1,"125":-1,"126":137269101935,"127":-1,"128":-1,"129":-1,"130":-1,"131":-1,"132":134599141616,"133":-1,"134":-1,"135":141164751331,"136":-1,"137":-1,"138":134511575117,"139":-1,"140":-1,"141":-1,"142":141216773515,"143":141172087745,"144":-1,"145":-1,"146":-1,"147":-1,"148":-1,"149":-1,"150":-1,"151":134504372376,"152":-1,"153":137773550141,"154":-1,"155":-1,"156":-1,"157":-1,"158":-1,"159":-1,"160":134550134225,"161":142810638272,"162":-1,"163":-1,"164":-1,"165":141267504034,"166":-1,"167":-1,"168":-1,"169":-1,"170":-1,"171":134508093374,"172":137308189264,"173":-1,"174":-1,"175":-1,"176":-1,"177":-1,"178":-1,"179":-1,"180":-1,"181":-1,"182":-1,"183":-1,"184":-1,"185":-1,"186":-1,"187":-1,"188":-1,"189":-1,"190":-1,"191":-1,"192":-1,"193":-1,"194":-1,"195":134413126813,"196":134564945232,"197":-1,"198":-1,"199":-1,"200":-1,"201":-1,"202":-1,"203":-1,"204":-1,"205":137312918445,"206":-1,"207":-1,"208":-1,"209":-1,"210":-1,"211":-1,"212":-1,"213":-1,"214":134520625411,"215":-1,"216":-1,"217":-1,"218":141334106710,"219":-1,"220":137320409807,"221":-1,"222":-1,"223":-1,"224":139519192126,"225":139448304744,"226":137308989848,"227":-1,"228":-1,"229":-1,"230":-1,"231":141159799464,"232":-1,"233":-1,"234":-1,"235":134504372376,"236":140205185902,"237":-1,"238":137292699615,"239":-1,"240":-1,"241":-1,"242":-1,"243":139469669874,"244":-1,"245":-1,"246":-1,"247":-1,"248":-1,"249":-1,"250":-1,"251":-1,"252":-1,"253":-1,"254":-1,"255":-1,"256":-1,"257":134590736097,"258":134515595594,"259":-1,"260":-1,"261":-1,"262":-1,"263":-1,"264":-1,"265":-1,"266":-1,"267":139503644209,"268":-1,"269":-1,"270":134517860724,"271":-1,"272":137270898191,"273":-1,"274":-1,"275":-1,"276":-1,"277":-1,"278":-1,"279":137263876981,"280":-1,"281":137320409807,"282":-1,"283":137763659791,"284":134415608855,"285":-1,"286":-1}}
So I get the feeling that Python looks at the curly braces within curly braces and maybe sees it as a self-contained list instead of a bunch of discrete items. Does that seem right? And if so, is there maybe a way to get Python to format the data before it sends it over? Because as far as I know, there's no native way to export dataframes to DynamoDB.

Dr Subterfuge
Aug 31, 2005

TIME TO ROC N' ROLL
json.load is turning your pandas json file into a nested dict. You're getting yelled at because "collector_key" is in fact a key whose value is a dict, and you can't turn a dict into an int.

E: It looks like your code is assuming that you have a collector key for each number, but what you actually have is one collector key with a bunch of numbers.

E2: But that still wouldn't work with your code now, because you're accessing "collector_key" from the same outer dict that you are trying to iterate over.
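
To make that concrete, here's roughly what json.load hands you from that file (a made-up miniature, just to show the shape):

Python code:
import json

# a tiny stand-in for the pandas-style file -- the real one has hundreds of keys
raw = '{"collector_key": {"0": -1, "1": -1, "2": 139517343969}}'
data = json.loads(raw)

print(type(data['collector_key']))       # <class 'dict'> -- the whole column, keyed by index
# int(data['collector_key'])             # raises the TypeError from your traceback
print(int(data['collector_key']['2']))   # 139517343969 -- you have to go one level deeper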

Dr Subterfuge fucked around with this message at 02:44 on Mar 3, 2018

Seventh Arrow
Jan 26, 2005

Wait, I just realized that the index - or row number, whatever you want to call it - is being included in "collector_key":

{"collector_key":{"0":-1,"1":-1,"2":139517343969,"3":-1,"4":-1...}

All those index numbers ("0", "1", "2", and so on) are totally superfluous. I wonder if removing them (if possible) will fix the problem.

Nippashish
Nov 2, 2005

Let me see you dance!

Seventh Arrow posted:

Wait, I just realized that the index - or row number, whatever you want to call it - is being included in "collector_key":

{"collector_key":{"0":-1,"1":-1,"2":139517343969,"3":-1,"4":-1...}

All those index numbers ("0", "1", "2", and so on) are totally superfluous. I wonder if removing them (if possible) will fix the problem.

No, this is not what you need to do. to_json is not storing "superfluous" data, but it's also not storing the data in the format you need. When you look at a json file you can read [] and {} just like you do in Python. When you call json.load on the file, [] becomes a list and {} becomes a dict.

What's going on is you have a table of data like:
code:
   A  B  C
0 a0 b0 c0
1 a1 b1 c1
2 a2 b2 c2
DynamoDB expects a sequence of records, like this:
code:
[
  {A: a0 B: b0 C: c0},
  {A: a1 B: b1 C: c1},
  {A: a2 B: b2 C: c2},
]
(i.e., this is a list of dicts, and the for movie in movies loop runs over the list).

Pandas .to_json stores the data differently though. to_json stores it like this:
code:
{
  A: { 0: a0, 1: a1, 2: a2 },
  B: { 0: b0, 1: b1, 2: b2 },
  C: { 0: c0, 1: c1, 2: c2 },
}
This is a dictionary of columns, where each column is a dictionary mapping index -> value (the index is there because pandas lets you use non-integer, non-contiguous indexes; it can't just write the values down in order because you could have an index like [0,1,23243254324324] instead of [0,1,2]).

For what you are trying to do I think the "json" part is leading you down a garden path. You need to do something like this:
code:
from __future__ import print_function # Python 2/3 compatibility
import boto3
import pandas as pd

dynamodb = boto3.resource(...)

table = dynamodb.Table('Loyalty_One')

with open("loyalty_one.json") as json_file:
    data = pd.read_json(json_file)
    for _, numbers in data.iterrows():  # iterrows yields (index, row) pairs
        collector_key = int(numbers['collector_key'])
        sales = int(numbers['sales'])
        store_location_key = int(numbers['store_location_key'])

        # ... continue as before

Seventh Arrow
Jan 26, 2005

Thanks for your reply. I was just thinking of this, and maybe it would be possible to read each row directly from the dataframe and not even bother with the json file? I'm not sure (yet) how to call a dataframe row by row, but maybe it's better to eliminate the extra step of going to a file in the first place.

Nippashish
Nov 2, 2005

Let me see you dance!

Seventh Arrow posted:

I'm not sure (yet) how to call a dataframe row by row

That's what iterrows does.

Seventh Arrow
Jan 26, 2005

I guess I'll have to look it up, but doesn't pandas use iloc (or something) to call on a given row?

Anyways, I'll give your code a try - many thanks!

SurgicalOntologist
Jun 17, 2004

Check the orient argument of to_json; pandas offers several different ways to organize the JSON. https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_json.html
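
For example, orient='records' gives you the list-of-dicts layout that the Amazon sample code expects. A rough sketch (filenames here are just placeholders):

Python code:
import pandas as pd

df = pd.read_csv('testdb.csv')                      # or however the dataframe is built
df.to_json('loyalty_one.json', orient='records')
# loyalty_one.json then looks like [{"collector_key": -1, "sales": ..., ...}, {...}, ...]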

vikingstrike
Sep 23, 2007

whats happening, captain
One is for indexing, one is for iterating.
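
Roughly (a toy example, not your data):

Python code:
import pandas as pd

df = pd.DataFrame({'collector_key': [-1, 139517343969], 'sales': [10, 20]})

row = df.iloc[1]                 # indexing: grab a single row by position
for i, row in df.iterrows():     # iterating: walk every row as (index, Series) pairs
    print(i, row['collector_key'], row['sales'])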

Wallet
Jun 19, 2006

I have what might be a dumb question which I've tried to explain comprehensibly with questionable success, but hopefully someone can point me in the right direction.

I'm dealing with anywhere from a hundred to a few thousand objects in a list (or not in a list, but the order of the objects is relevant), each of which has ~five attributes, which I'll call object.a through object.e for simplicity's sake. What I need to do is find certain patterns of objects with certain attributes: for example, I might need any two objects that do not have the same value for object.a, do have the same value for object.b, and are not separated by any objects that have one of a particular set of values for object.d.

There are 30 or so patterns I need to match, many of which don't have a fixed upper bound for how many objects can fit in different positions within the pattern, and many of which require matching the values of objects to each other or the number of objects fulfilling one condition to the number fulfilling another condition (ex: any number of consecutive objects with the same .a value followed by the same number of objects with the same .b value).

I could write an individual function to handle each pattern individually, but I'd rather not for obvious reasons, and I can imagine performance becoming a nightmare pretty quickly. Basically, I need the functionality of regex but for objects and across multiple attributes. What's the most reasonable way to approach this?

Seventh Arrow
Jan 26, 2005

Wouldn't it be possible to do a regex with an if/elif/else setup?

QuarkJets
Sep 8, 2008

What are those attributes? Strings? Floats? Other classes?

PoizenJam
Dec 2, 2006

Damn!!!
It's PoizenJam!!!

Eela6 posted:

This is an interesting problem that's more difficult than it appears. I spent a little bit of time fiddling with it and wasn't able to find a solution that preserved 'true' randomness (i.e., all valid strings are equally likely given the limits of the PRNG) that wasn't O(n²) or worse.

If it's not sensitive, would you mind showing me what you end up with?

I'll post the full solution after bugfixing, but I'll be pretty honest here: I'm kind of brute-forcing it with 'if' handlers for every legal sequence order, with enough artificial variability to appear random.

So the placement of L2T and L4T is simple enough:

code:
import time
import sys
from random import *
import os
from psychopy import visual,event,core,gui
import pyaudio
import audioop


word_list1 = [	'G R A I N',	'G L A R E',	'P H O N E',	'S C A L E',	'G U E S S',	'S T R A W',	'Q U O T E',	'D A I L Y',	'S H A M E',	'G R I E F',	'S H E E T',	'S L I D E',	'B L A Z E',	'H E A R T',	'C H I L L',	'G R A S P',	'W H E E L',	'S P L I T',	'W H I R L',	'I S S U E']

temp_trials = []
for i in range(len(word_list1)):
    temp_trials.append('NR')
    temp_trials.append('L0')
shuffle(temp_trials)
temp_trials_inc = 0
print temp_trials

study_trials = []
for i in range(len(word_list1)): 
    study_trials.append('L2P')
    study_trials.append('L4P')
    study_trials.append('L8P')
shuffle(study_trials)
print study_trials
print len(study_trials)

list_iteration = 0
length = len(study_trials)
while list_iteration < length:
    if study_trials[list_iteration] == 'L2P':
        study_trials.insert(list_iteration + 1,'L2T')
        length += 1
    list_iteration += 1
print study_trials
print len(study_trials)

list_iteration = 0
length = len(study_trials)
while list_iteration < length:
    add_items = 0
    if study_trials[list_iteration] == 'L4P':
        if (list_iteration + 1) == length:
            study_trials.append(temp_trials[temp_trials_inc])
            temp_trials_inc += 1
            study_trials.append('L4T')
            add_items += 2
        elif study_trials[list_iteration + 1] == 'L2P':
            study_trials.insert(list_iteration + 1, temp_trials[temp_trials_inc])
            temp_trials_inc += 1            
            study_trials.insert(list_iteration+2,'L4T')
            add_items += 2
        else:
            study_trials.insert(list_iteration+2,'L4T')
            add_items += 1
    length += add_items
    list_iteration += 1
print study_trials
print len(study_trials)
print study_trials.count('L4T')
I've completed most of the coding for the L8T trials... But I'm still dealing with a few bugs.

Seventh Arrow
Jan 26, 2005

I've seen this before: why would you do "from random import *" instead of just "import random"?

Data Graham
Dec 28, 2009

📈📊🍪😋



To make the rest of your code more opaque and confusing, duh.

:v:

Dr Subterfuge
Aug 31, 2005

TIME TO ROC N' ROLL
It puts everything from the imported package into your global namespace, so you can do things like call shuffle() directly instead of calling random.shuffle(). Practically, its advantage is that it cuts down on typing. Maybe there are other reasons to do it that I'm not aware of. It's generally not a good idea, though, because it imports everything implicitly, which makes it harder to tell where something like shuffle is defined, and it can cause hidden conflicts if you have something else with the same name in your global namespace. You can get the same behavior more explicitly with "from random import shuffle", so you only import what you actually want.
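
To spell out the difference (a small sketch):

Python code:
import random                  # explicit: random.shuffle(items) -- always obvious where it came from
from random import shuffle     # targeted: shuffle(items), and the import line documents the origin
from random import *           # star import: shuffle(items) works, but nothing tells you it came from random

items = [3, 1, 2]
random.shuffle(items)
shuffle(items)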

PoizenJam
Dec 2, 2006

Damn!!!
It's PoizenJam!!!

Seventh Arrow posted:

I've seen this before: why would you do "from random import *" instead of just "import random"?

Functionally:

Dr Subterfuge posted:

It puts everything from the imported package into your global namespace, so you can do things like call shuffle() directly instead of calling random.shuffle().


But here's the Overly Honest Methods answer: The 'import' section of the code is part of a multiple-generations-old experiment written by a long-graduated PhD that I have inherited and modified as necessary, and, well, 'if it ain't broke...'

Wallet
Jun 19, 2006

Seventh Arrow posted:

Wouldn't it be possible to do a regex with an if/elif/else setup?

That's the first approach that came to mind, but it seems pretty cumbersome.

QuarkJets posted:

What are those attributes? Strings? Floats? Other classes?

Strings mostly, a few are booleans.


JVNO posted:

I've completed most of the coding for the L8T trials... But I'm still dealing with a few bugs.

Given that it doesn't need to be truly random, it seems like it would be easier to create a function that finds all valid indices where a given form can be inserted and then have it pick a random one, although you would create configurations that would be impossible to complete some percentage of the time.

Edit: Like this, but less poo poo/lazy, probably with some if statements for when find_spot fails to find any valid positions, and maybe even some randomness in the order of insertion (it's been a long day, but this seems to work correctly):

Python code:
from random import SystemRandom


def find_spot(t, increment):
    valid = []
    for index, i in enumerate(t[:-increment]):
        if not i and not t[index + increment]:
            valid.append(index)
    SystemRandom().shuffle(valid)
    return valid[0]


trials = [None for i in range(160)]

for i in range(20):
    pos = find_spot(trials, 4)
    trials[pos] = 'L8P'
    trials[pos + 4] = 'L8T'
    pos = find_spot(trials, 2)
    trials[pos] = 'L4P'
    trials[pos + 2] = 'L4T'
    pos = find_spot(trials, 1)
    trials[pos] = 'L2P'
    trials[pos + 1] = 'L2T'

single_items = []
for i in range(20):
    single_items.extend(['NR', 'L0'])
SystemRandom().shuffle(single_items)

for index, i in enumerate(trials):
    if not i:
        trials[index] = single_items.pop()

print(trials)

Wallet fucked around with this message at 01:17 on Mar 4, 2018

PoizenJam
Dec 2, 2006

Damn!!!
It's PoizenJam!!!
Welp, this is the best I could come up with after hours and hours of coding and troubleshooting:

Python code:
import time
import sys
from random import *
import os
from psychopy import visual,event,core,gui
import pyaudio
import audioop


word_list1 = [	'G R A I N',	'G L A R E',	'P H O N E',	'S C A L E',	'G U E S S',	'S T R A W',	'Q U O T E',	'D A I L Y',	'S H A M E',	'G R I E F',	'S H E E T',	'S L I D E',	'B L A Z E',	'H E A R T',	'C H I L L',	'G R A S P',	'W H E E L',	'S P L I T',	'W H I R L',	'I S S U E']

#Create a randomized list of trials with no placement requirements
print 'Create Filler Trial List'
filler_trials = []
for i in range(len(word_list1)):
    filler_trials.append('NR')
    filler_trials.append('L0')
shuffle(filler_trials)
filler_trials_inc = 0
print ('Temp Trials: ', filler_trials)
print '________________________________________________'

#Create a randomized list of the prime trials
study_trials = []
for i in range(len(word_list1)): 
    study_trials.append('L2P')
    study_trials.append('L4P')
    study_trials.append('L8P')
shuffle(study_trials)
print 'Study Trials: '
print study_trials
print '# of Study Trials: ', len(study_trials)
print '________________________________________________'

#Add the L2T trials
list_iteration = 0
length = len(study_trials)
while list_iteration < length:
    if study_trials[list_iteration] == 'L2P':
        study_trials.insert(list_iteration + 1,'L2T')
        length += 1
    list_iteration += 1
print 'Study Trials + L2T: '
print study_trials
print '# of Study Trials: ', len(study_trials)
print '# of L2T Trials: ',study_trials.count('L2T')
print '________________________________________________'

#Add the L4T trials
list_iteration = 0
while list_iteration < length:
    add_items = 0
    if study_trials[list_iteration] == 'L4P':
        if (list_iteration + 1) == length:
            study_trials.append(filler_trials[filler_trials_inc])
            filler_trials_inc += 1
            study_trials.append('L4T')
            add_items += 2
        elif study_trials[list_iteration + 1] == 'L2P':
            study_trials.insert(list_iteration + 1, filler_trials[filler_trials_inc])
            filler_trials_inc += 1            
            study_trials.insert(list_iteration+2,'L4T')
            add_items += 2
        else:
            study_trials.insert(list_iteration+2,'L4T')
            add_items += 1
    length += add_items
    list_iteration += 1
print 'Study Trials + L4T: '
print study_trials
print '# of Study Trials: ', len(study_trials)
print '# of L4T Trials: ',study_trials.count('L4T')
print '# of Filler Trials: ',filler_trials_inc
print '________________________________________________'

#Add the L8T trials
list_iteration = 0
while list_iteration < length:
    add_items = 0 
    if study_trials[list_iteration] == 'L8P':
        if (list_iteration + 1) == length:
            study_trials.append(filler_trials[filler_trials_inc])
            filler_trials_inc += 1
            study_trials.append(filler_trials[filler_trials_inc])
            filler_trials_inc += 1
            study_trials.append(filler_trials[filler_trials_inc])
            filler_trials_inc += 1            
            study_trials.append('L8T')
            add_items += 4
        elif (list_iteration + 2) == length:
            study_trials.append(filler_trials[filler_trials_inc])
            filler_trials_inc += 1 
            study_trials.append(filler_trials[filler_trials_inc])
            filler_trials_inc += 1 
            study_trials.append('L8T')
            add_items += 3
        elif (list_iteration + 3) == length:
            study_trials.append(filler_trials[filler_trials_inc])
            filler_trials_inc += 1 
            study_trials.append('L8T')
            add_items += 2
        if (list_iteration + 3) < length:
            if study_trials[list_iteration + 2] != 'L4P' and study_trials[list_iteration + 3] != 'L2P': 
                study_trials.insert(list_iteration + 4, 'L8T')
                add_items += 1
            elif study_trials[list_iteration + 3] == 'L2P':
                study_trials.insert(list_iteration + 3, filler_trials[filler_trials_inc])
                filler_trials_inc += 1 
                study_trials.insert(list_iteration + 4, 'L8T')
                add_items += 2
            elif study_trials[list_iteration + 2] == 'L4P': 
                study_trials.insert(list_iteration + 2, filler_trials[filler_trials_inc])
                filler_trials_inc += 1 
                study_trials.insert(list_iteration + 3, filler_trials[filler_trials_inc])
                filler_trials_inc += 1 
                study_trials.insert(list_iteration + 4, 'L8T')
                add_items += 3
    length += add_items
    list_iteration += 1
print 'Study Trials + L8T: '
print study_trials
print '# of Study Trials: ', len(study_trials)
print '# of L8T Trials: ',study_trials.count('L8T')
print '# of Filler Trials: ',filler_trials_inc
print '________________________________________________'

#Add remaining conditionless trials at random
while filler_trials_inc < (len(filler_trials)):
    print 'length: ', length
    position = randint(0, length)
    print 'checking study list position: ', position
    add_items = 0
    if (study_trials[position - 4] != 'L8P' and study_trials[position - 3] != 'L8P' and study_trials[position - 2] != 'L8P' and study_trials[position - 1] != 'L8P') and (study_trials[position - 2] != 'L4P' and study_trials[position - 1] != 'L4P' ) and (study_trials[position - 1] != 'L2P') and (study_trials[position - 1] != 'NR' or study_trials[position - 1] != 'L0'): 
        study_trials.insert(position, filler_trials[filler_trials_inc])
        print filler_trials[filler_trials_inc], 'placed!'
        filler_trials_inc += 1
        add_items += 1
        print study_trials
    else:
        print 'NOT VALID'
        pass     
    length += add_items
print 'Study Trials + Filler: '
print study_trials
print '# of Study Trials: ', len(study_trials)
print '# of Filler Trials: ',filler_trials_inc

quote:

Given that it doesn't need to be truly random, it seems like it would be easier to create a function that finds all valid indices where a given form can be inserted and then have it pick a random one, although you would create configurations that would be impossible to complete some percentage of the time.

Edit: Like this, but less poo poo/lazy, probably with some if statements for when find_spot fails to find any valid positions, and maybe even some randomness in the order of insertion (it's been a long day, but this seems to work correctly):

I need to test this for my own purposes, but if this works, it's much more elegant. I have a lot yet to learn about python :v:

Edit: Welp, that's a story as old as time. Spend days working on a piece of code only to have a much simpler solution presented after you finally figure it out. That version works and is a hell of a lot better than my code. Hope you don't mind me yanking that for my experiments?

PoizenJam fucked around with this message at 01:33 on Mar 4, 2018

Wallet
Jun 19, 2006

JVNO posted:

Edit: Welp, that's a story as old as time. Spend days working on a piece of code only to have a much simpler solution presented after you finally figure it out. That version works and is a hell of a lot better than my code. Hope you don't mind me yanking that for my experiments?

Go for it, happy to help. Just mind that I think it's theoretically possible for it to get itself into a state where it can't finish, which you might want to account for.
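
If it ever does bite you, the laziest fix is probably to retry the whole generation whenever find_spot comes up empty. A sketch, assuming the generation above is wrapped in a hypothetical generate_trials() function:

Python code:
def build_trials(max_attempts=100):
    for _ in range(max_attempts):
        try:
            return generate_trials()   # the find_spot-based code above, wrapped in a function
        except IndexError:             # find_spot found no valid positions (valid[0] on an empty list)
            continue
    raise RuntimeError('could not build a valid trial list')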

PoizenJam
Dec 2, 2006

Damn!!!
It's PoizenJam!!!

Wallet posted:

Go for it, happy to help. Just mind that I think it's theoretically possible for it to get itself into a state where it can't finish, which you might want to account for.

I ran 100,000 iterations of your list generation with no errors, so I'll take my chances :v:

I think the single items provide enough degrees of freedom for list ordering that it's impossible to generate an invalid list set.

Wallet
Jun 19, 2006

JVNO posted:

I ran 100,000 iterations of your list generation with no errors, so I'll take my chances :v:

I think the single items provide enough degrees of freedom for list ordering that it's impossible to generate an invalid list set.

Fair enough; I wasn't sure if the distribution was always the same or not, and I was also too lazy to test it 100,000 times.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Anyone here use Pandas to generate reports for end-users? If so, what does your workflow look like? I'm stuck a bit in the middle: my current process is to use Python for data cleanup, but then I load the data into an Excel workbook for the charts and pivot tables I share with users.

vikingstrike
Sep 23, 2007

whats happening, captain
What type of reports are you thinking of? From what you describe, the logical addition would be matplotlib/seaborn to plot figures in Python.

Dr Subterfuge
Aug 31, 2005

TIME TO ROC N' ROLL
There are also ways to automate Excel file creation from python if you haven't already gone that route.
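
For instance, pandas can write .xlsx files directly (it uses openpyxl or xlsxwriter under the hood). A minimal sketch with made-up data:

Python code:
import pandas as pd

df = pd.DataFrame({'department': ['A', 'B'], 'compliance': [0.92, 0.85]})   # placeholder numbers

with pd.ExcelWriter('compliance_report.xlsx') as writer:
    df.to_excel(writer, sheet_name='Summary', index=False)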

Hughmoris
Apr 21, 2007
Let's go to the abyss!

vikingstrike posted:

What type of reports are you thinking? Out of what you describe, the logical addition would be matplotlib/seaborn to plot figures in python.

I work in healthcare and my current report goes to department managers and shows staff compliance for documentation of a certain procedure. The vast majority of managers are not technical but they are comfortable enough to open up the Excel workbook I email them and at least look at the first chart that shows how their department is doing against the hospital.

If there is a way to paste an image inline in Outlook 2013, I've thought about removing the workbook entirely and generating an email for each department, pasting the charts and table inside the email body. Basically I'm trying to spoon-feed the end users as much as possible to make their lives easier.

vikingstrike
Sep 23, 2007

whats happening, captain
You can send email using Python and write it so the figures appear inline in the body. It's been a while since I've done this, but it should be easy to automate: data cleaning -> figure generation -> email.
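
If memory serves, it's something along these lines, all standard library (addresses, server, and filenames are made up; the Content-ID header is what lets the chart render inline in the HTML body):

Python code:
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.image import MIMEImage

msg = MIMEMultipart('related')
msg['Subject'] = 'Documentation compliance'
msg['From'] = 'reports@example.com'
msg['To'] = 'manager@example.com'
msg.attach(MIMEText('<p>This month:</p><img src="cid:chart">', 'html'))

with open('chart.png', 'rb') as f:          # a figure saved out of matplotlib/seaborn
    img = MIMEImage(f.read())
img.add_header('Content-ID', '<chart>')
msg.attach(img)

with smtplib.SMTP('smtp.example.com') as server:
    server.send_message(msg)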

Sad Panda
Sep 22, 2004

I'm a Sad Panda.
Python question: how do you deal with doing something twice? I'm making a game of self-playing Blackjack and have the following...

Python code:
    def initial_deal(self):
        ''' Initial deal.'''
        new_deck.add_next_card_to_hand(self.players_hand.hand)
        new_deck.add_next_card_to_hand(self.players_hand.hand)
        new_deck.add_next_card_to_hand(self.dealers_hand.hand)
        new_deck.add_next_card_to_hand(self.dealers_hand.hand)
That seems the simplest way of doing it, and the player will never get more than 2 cards during the initial deal. I might change it so the dealer can get one or two one day, but that seems by far the easiest way to do it rather than using a loop to do it twice.

QuarkJets
Sep 8, 2008

I would use a for loop to do the thing twice.

Seventh Arrow
Jan 26, 2005

Nippashish posted:

No, this is not what you need to do. to_json is not storing "superfluous" data, but it's also not storing the data in the format you need. When you look at a json file you can read [] and {} just like you do in Python. When you call json.load on the file, [] becomes a list and {} becomes a dict.

What's going on is you have a table of data like:
code:
   A  B  C
0 a0 b0 c0
1 a1 b1 c1
2 a2 b2 c2
DynamoDB expects a sequence of records, like this:
code:
[
  {A: a0 B: b0 C: c0},
  {A: a1 B: b1 C: c1},
  {A: a2 B: b2 C: c2},
]
(i.e., this is a list of dicts, and the for movie in movies loop runs over the list).

Pandas .to_json stores the data differently though. to_json stores it like this:
code:
{
  A: { 0: a0, 1: a1, 2: a2 },
  B: { 0: b0, 1: b1, 2: b2 },
  C: { 0: c0, 1: c1, 2: c2 },
}
This is a dictionary of columns, where each column is a dictionary mapping index -> value (the index is there because pandas lets you use non-integer, non-contiguous indexes; it can't just write the values down in order because you could have an index like [0,1,23243254324324] instead of [0,1,2]).

For what you are trying to do I think the "json" part is leading you down a garden path. You need to do something like this:
code:
from __future__ import print_function # Python 2/3 compatibility
import boto3
import pandas as pd

dynamodb = boto3.resource(...)

table = dynamodb.Table('Loyalty_One')

with open("loyalty_one.json") as json_file:
    data = pd.read_json(json_file)
    for _, numbers in data.iterrows():  # iterrows yields (index, row) pairs
        collector_key = int(numbers['collector_key'])
        sales = int(numbers['sales'])
        store_location_key = int(numbers['store_location_key'])

        # ... continue as before

It looks like I'm still having a bit of a rough time with this. I decided to try it with a csv file instead to see if it's less fussy, and I'm not so sure it is. I realized that "iterrows" only works on dataframes (at least, I think so), so I tried to update my code accordingly:

code:
from __future__ import print_function # Python 2/3 compatibility
import boto3
import json
import decimal
import pandas as pd

dynamodb = boto3.resource('dynamodb', region_name='us-west-2', aws_access_key_id='123456789', aws_secret_access_key='987654321')

table = dynamodb.Table('Loyalty_One')

data = pd.read_csv('testdb.csv')
	
for i, rows in data[['collector_key', 'sales']].iterrows():
	collector_key = data[['collector_key']].astype(float)
	sales = data[['sales']]
			
	print("Adding data:", collector_key, sales)
		
	table.put_item(
		Item={
			'collector_key': collector_key,
			'sales': sales,
           }
       )
 
It's not liking line 17 and I'm really puzzled as to how it should be formatted. The way I have it now gives the error:
code:
TypeError: Unsupported type "<class 'pandas.core.frame.DataFrame'>" for value "       collector_key
0      -1.000000e+00
1      -1.000000e+00
...              ...
9972   -1.000000e+00
9973    1.345490e+11
[20000 rows x 1 columns]"
I've tried

code:
sales = float(data['sales'])
as well as

code:
sales = data[['sales']].astype(float)
It sees the data type for 'sales' as being "float," though, so I shouldn't need to convert it.

edit: looking over the full error message, it looks like boto3 is the one doing the complaining. I want to think that maybe boto3 and pandas don't get along, but as far as I know all boto3 sees is numbers being handed to it.

vikingstrike
Sep 23, 2007

whats happening, captain
Look at this loop and see if you can spot where you're tripping up:

code:
for i, rows in data[['collector_key', 'sales']].iterrows():
	collector_key = data[['collector_key']].astype(float)
	sales = data[['sales']]
			
	print("Adding data:", collector_key, sales)
		
	table.put_item(
		Item={
			'collector_key': collector_key,
			'sales': sales,
           }
       )
 

Seventh Arrow
Jan 26, 2005

I thought it was that last comma, but I removed it and no dice.

Does the placement of the last brackets matter?

vikingstrike
Sep 23, 2007

whats happening, captain
Nope. It has to do with how you are first assigning collector key and sales.

Seventh Arrow
Jan 26, 2005

The [['collector_key', 'sales']] seemed superfluous, so I removed it:

code:
for i, row in data.iterrows():
	collector_key = data[['collector_key']].astype(float)
	sales = data[['sales']]
Still errors, though. Admittedly I'm just looking through google for other examples of iterrows code.

vikingstrike
Sep 23, 2007

whats happening, captain
You aren't using the row data, you are using the original DataFrame. This code

code:
collector_key = data[['collector_key']].astype(float)
should reference row, not data. Same for sales. When you write data[['collector_key']], you are passing a whole pandas DataFrame (and data['collector_key'] would be a Series), not the scalar value for that row.
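
i.e. something like this (a sketch):

Python code:
for i, row in data.iterrows():
    collector_key = float(row['collector_key'])   # row is a Series; row['collector_key'] is a single value
    sales = float(row['sales'])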

Seventh Arrow
Jan 26, 2005

Thank you greatly. I think I was on the right track because I came across this page:

https://www.analyticsvidhya.com/blog/2016/01/12-pandas-techniques-python-data-manipulation/

and it showed rows being referenced. I was thrown off by the formatting though. I think it needs to go something like this:

code:
for index, row in data.iterrows():
	collector_key = (row['collector_key']).astype(float)
	sales = (row['sales']).astype(float)
I can't verify it yet, though, because apparently DynamoDB doesn't accept float numbers, only int and Decimal, and I don't think ".astype" lets you use Decimal.

It doesn't matter that much, though... the exercise only requires me to dump this stuff into the database, so I guess I'll put it through as a string type.
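
(Though I suppose I could also try going through the decimal module directly instead of strings; an untested sketch:)

Python code:
from decimal import Decimal

for i, row in data.iterrows():
    table.put_item(Item={
        'collector_key': int(row['collector_key']),
        'sales': Decimal(str(row['sales'])),   # Decimal(str(...)) avoids float representation issues
    })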

Boris Galerkin
Dec 17, 2011

I don't understand why I can't harass people online. Seriously, somebody please explain why I shouldn't be allowed to stalk others on social media!

Sad Panda posted:

Python question: how do you deal with doing something twice? I'm making a game of self-playing Blackjack and have the following...

Python code:
    def initial_deal(self):
        ''' Initial deal.'''
        new_deck.add_next_card_to_hand(self.players_hand.hand)
        new_deck.add_next_card_to_hand(self.players_hand.hand)
        new_deck.add_next_card_to_hand(self.dealers_hand.hand)
        new_deck.add_next_card_to_hand(self.dealers_hand.hand)
That seems the simplest way of doing it, and the player will never get more than 2 cards during the initial deal. I might change it so the dealer can get one or two one day, but that seems by far the easiest way to do it rather than using a loop to do it twice.

I would probably do this in your case

Python code:
    def initial_deal(self):
        ''' Initial deal.'''
        for _ in range(2):
            new_deck.add_next_card_to_hand(self.players_hand.hand)
            new_deck.add_next_card_to_hand(self.dealers_hand.hand)
But honestly if I had to deal two cards I'd probably just copy and paste the line and have it twice because it's less effort.

baka kaba
Jul 19, 2003

PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

The loop protects you from the universe's ironic sense of humour

Sad Panda
Sep 22, 2004

I'm a Sad Panda.
Next part of my Blackjack program. A lookup table. A short extract of the data would be...


code:
        2   3   4   5   6   7   8   9   T   A
Hard 5  H   H   H   H   H   H   H   H   H   H
Hard 6  H   H   H   H   H   H   H   H   H   H
Hard 7  H   H   H   H   H   H   H   H   H   H
Hard 8  H   H   H   H   H   H   H   H   H   H
Hard 9  H   D   D   D   D   H   H   H   H   H
Hard 10 D   D   D   D   D   D   D   D   H   H
I want to be able to input 2 and Hard 6 and have it return H.

My original idea was 2D arrays, but those don't seem to support column names, which is what I'd call the 2/3/4/... at the top. I found one solution, and he used a pickled 'av table' (or so the variable name suggests), but that seems a bit beyond me right now.
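
Would a dict of dicts be the right way to go? Something like this, a made-up sketch covering just a couple of rows:

Python code:
strategy = {
    'Hard 5': {'2': 'H', '3': 'H', '4': 'H', '5': 'H'},   # ...one entry per dealer upcard
    'Hard 9': {'2': 'H', '3': 'D', '4': 'D', '5': 'D'},
}

print(strategy['Hard 9']['3'])   # 'D'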
