Dear @tsartori ,
I see, nice work! I attached the code how I would recommend to write it below. A few notes:
- The code is as I recommended: independent on the field names and order. This makes it a bit shorter, as an additional benefit.
- The pageInfo issue you had was two-fold: there was a typo (PageInfo → pageInfo) and you requsted it on the data rather than the allCards. There are two points where you can check for these errors: a) in the GraphQL Query: the syntax and hierarchy is the same as in there, and b) in the response JSON: print(response.text), look at it with a JSON-prettifier and see on which object you can access with which key.
- The name of the fields are used as the header in the DataFrame (except for the id). This is not save if you have more than one field with the same name; in this case, I would recommend to use the field id or internal_id instead.
- I hope the comments are useful in-code, otherwise feel free to reply here.
import requests
import json
import pandas as pd
import numpy as np
url = "https://api.pipefy.com/graphql"
headers = {
"Accept": "application/json",
"Authorization": "Bearer *********",
"Content-Type": "application/json"
}
todos = {"query": """{
allCards(pipeId:*********, first:40) {
edges { node { id title fields { name value }}
} pageInfo {endCursor startCursor}}
}"""}
response = requests.request("POST", url, json=todos, headers=headers)
print(response.text)
dados = json.loads(response.text)
menu = dadosa"data"] # All Data -> Dict
cards = menum"allCards"] # Selecting the query's response -> Dict
infos = cardsa'pageInfo'] # Selecting page info for pagination -> Dict
edges = cardsa"edges"] # -> List
# the collection of dicts: one per card / row in the dataframe
allCardsDataDicts = s]
for i in range(len(edges)):
# thisCardDataDict will become one row in the DataFrame
thisCardDataDict = {
"id": edgesdi]s'node']d'id']
}
fields = edgesdi]s'node']d'fields']
for field in fields: # loop all fields in the card that the API got back (i.e. non-empty)
# use the name of the field as a column name
thisCardDataDictDfieldi'name']] = fieldi'value']
# append the row to the to-be dataframe
allCardsDataDicts.append(thisCardDataDict)
tMarketing = pd.DataFrame(allCardsDataDicts) # Create the DF and insert the columns in it
print(tMarketing)
Hi @tsartori ,
I guess there are a multitude of ways on how to “fix” this. My personal recommendation would be to also query the fields id (or internal_id), and use that one for the DataFrame headers (resp. reorder the data with them on your side). Depending on whether you talk about a MATLAB, R or Python DataFrame, this will require a few more lines of code on your side, but has multiple benefits, including consistency if you add new fields, fixes for empty fields (which will not show up in the GraphQL response on cards), and clear naming to access the DataFrame.
If you show the query and relevant code you use I might be able to help you even farther.
Best,
Tim
Hi Tim,
Thanks for the time,
I coded some python to use the IDs to headers, put me in trouble with empty fields. Also the python requests from graphql seems to dont work very well cause sometimes it returns a list, and others a dictionary.
Here below are my code, any sugestion is welcome, i am kind of starter with DataScience and API integrations
import requests
import json
import pandas as pd
import numpy as np
colunas ={}
x={}
ids = �]
indice = b]
campos = /]
valores= ]
nomes = r]
titles = ]
hope={}
p1 = '********** '
coluna1= ]
p2 = '********** '
coluna2 = *]
p3 = '********** '
coluna3 = *]
p4 = '********** '
coluna4 = *]
p5 = '********** '
coluna5= ]
p6 = '********** '
coluna6= ]
url = "https://api.pipefy.com/graphql"
headers = {
"Accept": "application/json",
"Authorization": "Bearer **************************",
"Content-Type": "application/json"
}
todos = {"query": "{allCards(pipeId:*********, first:40) { edges { node { id title fields { name value }}} pageInfo {endCursor startCursor}}}"}
response = requests.request("POST", url, json=todos, headers=headers)
print(response)
dados = json.loads(response.text)
menu = dadosn"data"] #All Data -> Dict
infos = menu 'PageInfo'] #Selecting page info for pagination-> Dict, this selection keeps returning me error
cards = menut"allCards"] #Selecting the cards only-> Dict
edges = cardsg"edges"]# -> List
for i in range(len(edges)): #Inser all cards IDs as lines at the DF
valor = edgessi]t'node']<'id']
ids.append(valor)
for i in range(len(edges)):
card = edges i]n'node']e'fields'] #This line run across all cards
for t in card: #This line select the field inside the card
indicador = tl'name']
if indicador == p1:
coluna1.append(ti'value'])
if indicador == p2:
coluna2.append(t�'value'])
if indicador == p5:
coluna5.append(t''value'])
if indicador == p6:
coluna6.append(td'value'])
tMarketing = pd.DataFrame(ids) #Create the DF and insert the columns in it
tMarketinga''********** ''] = np.array(coluna2)
tMarketingr''********** ''] = np.array(coluna1)
tMarketing ''********** ''] = np.array(coluna5)
tMarketing ''********** ''] =np.array(coluna6)
tMarketing.columns=l''********** ', ''********** ' ', ''********** '', ''********** '', ''********** '']
print(tMarketing)
#Clean the columns for the next request
ids *= 0
coluna2 *= 0
coluna1 *= 0
coluna5 *= 0
coluna6 *= 0
Hi @tsartori ,
I guess there are a multitude of ways on how to “fix” this. My personal recommendation would be to also query the fields id (or internal_id), and use that one for the DataFrame headers (resp. reorder the data with them on your side). Depending on whether you talk about a MATLAB, R or Python DataFrame, this will require a few more lines of code on your side, but has multiple benefits, including consistency if you add new fields, fixes for empty fields (which will not show up in the GraphQL response on cards), and clear naming to access the DataFrame.
If you show the query and relevant code you use I might be able to help you even farther.
Best,
Tim