pandas: IO for Google BigQuery

Reading data from BigQuery with user account credentials

In [1]: import pandas as pd

To run a query in BigQuery you need your own BigQuery project. With a project in place, you can query the public sample data:

In [2]: data = pd.read_gbq('''SELECT title, id, num_characters
   ...:                       FROM [publicdata:samples.wikipedia]
   ...:                       LIMIT 5'''
   ...:                    , project_id='<your-project-id>')

This will print out:

Your browser has been opened to visit:

    https://accounts.google.com/o/oauth2/v2/auth...[long URL truncated]

If your browser is on a different machine then exit and re-run this
application with the command-line parameter

  --noauth_local_webserver

If you are working on a local machine, a browser window will pop up. After you grant the requested privileges, pandas continues with the following output:

Authentication successful.
Requesting query... ok.
Query running...
Query done.
Processed: 13.8 Gb

Retrieving results...
Got 5 rows.

Total time taken 1.5 s.
Finished at 2016-08-23 11:26:03.

Result:

In [3]: data
Out[3]: 
               title       id  num_characters
0       Fusidic acid   935328            1112
1     Clark Air Base   426241            8257
2  Watergate scandal    52382           25790
3               2005    35984           75813
4               .BLP  2664340            1659
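
The bracketed table name in the query above is BigQuery legacy SQL syntax, `[project:dataset.table]`. As a rough illustration, the sketch below splits such a reference into its parts and rewrites it in the backtick form that standard SQL uses. The helper names `parse_legacy_table` and `to_standard_sql` are hypothetical, not part of pandas, and the pattern is deliberately simple (it does not handle domain-scoped project IDs).

```python
import re

# Legacy SQL table reference: [project:dataset.table]
# Simplifying assumption: the project ID contains no colon.
_LEGACY_REF = re.compile(
    r"^\[(?P<project>[^:\]]+):(?P<dataset>[^.\]]+)\.(?P<table>[^\]]+)\]$"
)


def parse_legacy_table(ref):
    """Split '[project:dataset.table]' into (project, dataset, table)."""
    match = _LEGACY_REF.match(ref.strip())
    if match is None:
        raise ValueError("not a legacy SQL table reference: {!r}".format(ref))
    return match.group("project"), match.group("dataset"), match.group("table")


def to_standard_sql(ref):
    """Rewrite a legacy reference as a standard SQL `project.dataset.table` name."""
    return "`{}.{}.{}`".format(*parse_legacy_table(ref))


print(parse_legacy_table("[publicdata:samples.wikipedia]"))
# ('publicdata', 'samples', 'wikipedia')
print(to_standard_sql("[bigquery-public-data:samples.shakespeare]"))
# `bigquery-public-data.samples.shakespeare`
```

Which dialect `read_gbq` expects by default depends on the installed version, so check the documentation for your release before switching table syntax.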

As a side effect, pandas creates a JSON file named bigquery_credentials.dat. With this file in place you can run further queries without granting privileges again:

In [9]: pd.read_gbq('SELECT count(1) cnt FROM [publicdata:samples.wikipedia]'
                   , project_id='<your-project-id>')
Requesting query... ok.
[rest of output truncated]

Out[9]: 
         cnt
0  313797035
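
Because the cached credentials decide whether the browser prompt appears again, it can help to check for the file before running a query, or to delete it to force a fresh authorization. The following is a minimal sketch that assumes the file is written to the current working directory; the actual location may differ between pandas versions. The function names are hypothetical helpers, not pandas API.

```python
from pathlib import Path

# File name taken from the text above.
# Assumption: it lives in the working directory of the Python session.
CREDENTIALS_FILE = "bigquery_credentials.dat"


def has_cached_credentials(directory="."):
    """Return True if the cached credentials file exists in `directory`."""
    return (Path(directory) / CREDENTIALS_FILE).is_file()


def clear_cached_credentials(directory="."):
    """Delete the cached credentials file; return True if a file was removed."""
    path = Path(directory) / CREDENTIALS_FILE
    if path.is_file():
        path.unlink()
        return True
    return False


if has_cached_credentials():
    print("Cached credentials found; no browser prompt expected.")
else:
    print("No cached credentials; the next query will ask for authorization.")
```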

Reading data from BigQuery with service account credentials

If you have created a service account and have its private key JSON file, you can use that file to authenticate with pandas:

In [5]: pd.read_gbq('''SELECT corpus, sum(word_count) words
                       FROM [bigquery-public-data:samples.shakespeare]       
                       GROUP BY corpus                                
                       ORDER BY words desc
                       LIMIT 5'''
                   , project_id='<your-project-id>'
                   , private_key='<private key json contents or file path>')
Requesting query... ok.
[rest of output truncated]

Out[5]: 
           corpus  words
0          hamlet  32446
1  kingrichardiii  31868
2      coriolanus  29535
3       cymbeline  29231
4    2kinghenryiv  28241
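
Since `private_key` accepts either the JSON contents or a file path, a quick sanity check of the key before calling `read_gbq` can give clearer error messages than a failed request. The sketch below is one possible approach, not part of pandas: `load_key_info` is a hypothetical helper, and the list of required fields is an assumption based on the usual layout of a Google service account key file.

```python
import json

# Fields normally present in a service account key file (assumed minimum set).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def load_key_info(private_key):
    """Return the parsed key from JSON contents or from a file path."""
    try:
        # First try to treat the argument as the JSON contents themselves.
        info = json.loads(private_key)
    except ValueError:
        # Otherwise treat it as a path to the key file.
        with open(private_key) as handle:
            info = json.load(handle)
    if not isinstance(info, dict):
        raise ValueError("service account key must be a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in info]
    if missing:
        raise ValueError("key is missing fields: {}".format(", ".join(missing)))
    if info["type"] != "service_account":
        raise ValueError("key type is {!r}, expected 'service_account'".format(info["type"]))
    return info
```

A call such as `load_key_info('<private key json contents or file path>')` returns the parsed dictionary, whose `project_id` entry can then be compared with the `project_id` passed to `read_gbq`.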

Modified text is an extract of the original Stack Overflow Documentation

Licensed under CC BY-SA 3.0

Not affiliated with Stack Overflow
