python - NTILE for Sqlite from Pandas gives OPERATIONAL ERROR

I'm trying to use the NTILE function when querying an SQLite database from pandas, but I haven't succeeded, even though I've rechecked the syntax many times.

A self-contained example is below. Setup:

    import pandas as pd
    from sqlalchemy import create_engine

    # SQLite database stored in the file test.db
    disk_engine = create_engine('sqlite:///test.db')

    marks = pd.DataFrame({'studentid': ['s1', 's2', 's3', 's4', 's5'],
                          'marks': [75, 83, 91, 83, 93]})
    marks.to_sql('marks_sql', disk_engine, if_exists='replace')

Now I try to use NTILE:

    q = """select studentid, marks, ntile(2) over (order by marks desc)
            as groupexample from marks_sql"""
    pd.read_sql_query(q, disk_engine)

The traceback is long; its main parts are:

    OperationalError: near "(": syntax error
    OperationalError: (sqlite3.OperationalError) near "(": syntax error [SQL: 'select studentid, marks, ntile(2) over (order by marks desc)\n        as groupexample from marks_sql']

Thanks!

There is no NTILE() OVER functionality in SQLite before version 3.25.0, the release that added window functions.

It gives me the same error, so you need to recreate it using a more complex query or functions.

Here is a list of analytical functions that are not available in SQLite.

NTILE is one of these.

The parser goes through the query, finds OVER, thinks it is a column name, does not expect a ( to follow a column name, and gives the error.

To replicate NTILE, try this:
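Whether the SQLite library behind Python's sqlite3 module accepts NTILE can be checked directly. The following is a minimal sketch, not part of the original answer: the helper names and the throwaway table `t` are made up for illustration, and 3.25.0 is the SQLite release that introduced window functions.

```python
import sqlite3


def supports_window_functions() -> bool:
    """True if the linked SQLite library is 3.25.0 or newer."""
    return sqlite3.sqlite_version_info >= (3, 25, 0)


def try_ntile() -> bool:
    """Run a tiny NTILE query in memory; False if SQLite rejects it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table t (x integer)")  # hypothetical scratch table
        conn.executemany("insert into t values (?)", [(1,), (2,), (3,), (4,)])
        conn.execute("select x, ntile(2) over (order by x desc) from t").fetchall()
        return True
    except sqlite3.OperationalError:
        # Older libraries fail here with: near "(": syntax error
        return False
    finally:
        conn.close()


print("SQLite library version:", sqlite3.sqlite_version)
print("NTILE available:", try_ntile())
```

Note that `sqlite3.sqlite_version` reports the version of the SQLite library itself, not the version of the Python module.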

    select *,
      case
        when
          (select count(*)+0.0 from marks_sql b where marks_sql.marks >= b.marks)
          / (select count(*) from marks_sql) > 0.5
        then 1
        else 2
      end
    from marks_sql;

To do this in such a way that the table can grow in size and the technique still applies, we have to do a few things:

So first we order the table by marks (essentially creating a ranking). For each row, this counts the rows that it has higher or equal marks than:

    select count(*)+0.0 from marks_sql b where marks_sql.marks >= b.marks  -- rank of mark

We add 0.0 to make the number a float so that our fraction works in the next step.

Then we take the rank and divide it by the total row count:
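The reason for the `+0.0` is that SQLite divides two integers as integers and drops the fraction. A small sketch, using an in-memory database and the made-up numbers 3 and 5 purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Integer / integer truncates; adding 0.0 first turns the numerator into a float.
int_div, float_div = conn.execute("select 3 / 5, (3 + 0.0) / 5").fetchone()
conn.close()

print(int_div)    # integer division result
print(float_div)  # floating-point division result
```

Without the float conversion every rank below the row count would divide to 0, so the comparison against 0.5 would only succeed for the top-ranked rows.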

    select count(*) from marks_sql  -- row count

This gives us the distribution over the range of scores, that is, the percentile of each student. We do not care about each exact percentile; we only care about NTILE(2), i.e. whether the student is in the top half.

That is where the CASE statement comes into play. If the percentile of the student is over 50%, they fall in the #1 group, the top 50th percentile. Else they fall in the #2 group.
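The rank-divided-by-row-count step can be looked at on its own before the CASE is applied. This is a sketch under assumptions: it loads the question's five rows into an in-memory database through the standard sqlite3 module, refers to the outer row as `marks_sql.marks`, and names the result column `percentile`, none of which comes from the answer itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table marks_sql (studentid text, marks integer)")
conn.executemany(
    "insert into marks_sql values (?, ?)",
    [("s1", 75), ("s2", 83), ("s3", 91), ("s4", 83), ("s5", 93)],
)

# For each outer row: (rows with marks <= this row's marks) / (all rows)
percentile_query = """
select studentid, marks,
       (select count(*) + 0.0 from marks_sql b where marks_sql.marks >= b.marks)
       / (select count(*) from marks_sql) as percentile
from marks_sql
order by studentid
"""
percentiles = {sid: pct for sid, _marks, pct in conn.execute(percentile_query)}
conn.close()

for sid, pct in percentiles.items():
    print(sid, pct)
```

The two students tied on 83 get the same fraction, since each of them counts the other as "lower or equal".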
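Putting the pieces together, the whole workaround can be run end to end. Again a sketch rather than a definitive implementation: it uses the standard sqlite3 module with an in-memory copy of the question's data, qualifies the outer row as `marks_sql.marks`, and reuses the question's alias `groupexample` for the computed column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table marks_sql (studentid text, marks integer)")
conn.executemany(
    "insert into marks_sql values (?, ?)",
    [("s1", 75), ("s2", 83), ("s3", 91), ("s4", 83), ("s5", 93)],
)

# Group 1 if more than half of all rows have marks <= this row's marks, else group 2.
workaround_query = """
select studentid, marks,
       case
         when (select count(*) + 0.0 from marks_sql b
               where marks_sql.marks >= b.marks)
              / (select count(*) from marks_sql) > 0.5
         then 1
         else 2
       end as groupexample
from marks_sql
order by studentid
"""
rows = conn.execute(workaround_query).fetchall()
conn.close()

for row in rows:
    print(row)
```

The same query string should also work with the question's setup via `pd.read_sql_query(workaround_query, disk_engine)`. With this data the query places four of the five students in group 1, because both tied 83s land above 0.5; NTILE(2) itself would split five rows into 3 and 2, so the result is an approximation when there are ties.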
