#StackBounty: #postgresql #geometry #postgresql-9.6 Query using subset of path in Postgres

Bounty: 50

Given this table:

 id |            points (path)                 |
----+------------------------------------------+
  1 | ((1,2),(3,4),(5,6),(7,8))                |

Is it possible to achieve the following using a single geometric operator and a path argument that is a sequential subset of the stored path, such as ((3,4),(5,6))?

select * from things where points @> '(3,4)' and points @> '(5,6)';
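
For reference, here is a minimal reproduction of the setup (the column names and types are assumptions based on the output above):

create table things (
    id     integer primary key,
    points path
);

insert into things (id, points) values (1, '((1,2),(3,4),(5,6),(7,8))');

-- current approach: one @> (contains) check per point
select * from things where points @> '(3,4)' and points @> '(5,6)';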


Get this bounty!!!

#StackBounty: #postgresql #lua #openresty Connecting to postgresql from lapis

Bounty: 50

I decided to play with lapis (https://github.com/leafo/lapis), but the application fails when I try to query the database (PostgreSQL), with the following output:

2017/07/01 16:04:26 [error] 31284#0: *8 lua entry thread aborted: runtime error: attempt to yield across C-call boundary
stack traceback:
coroutine 0:
[C]: in function 'require'
/usr/local/share/lua/5.1/lapis/init.lua:15: in function 'serve'
content_by_lua(nginx.conf.compiled:22):2: in function , client: 127.0.0.1, server: , request: "GET / HTTP/1.1", host: "localhost:8080"

The code that causes the error:

local db = require("lapis.db")
local res = db.query("SELECT * FROM users");

config.lua:

config({ "development", "production" }, {
    postgres = {
        host = "0.0.0.0",
        port = "5432",
        user = "wars_base",
        password = "12345",
        database = "wars_base"
    }
})

The database is running, the table is created, and there is one record in the table.

What could be the problem?


Get this bounty!!!

#StackBounty: #python #postgresql #dataset Best way to access and close a postgres database using python dataset

Bounty: 300

import dataset    
from sqlalchemy.pool import NullPool

db = dataset.connect(path_database, engine_kwargs={'poolclass': NullPool})

table_f1 = db['name_table']
# Do operations on table_f1

db.commit()
db.executable.close()

I use this code to access a postgres database and sometimes write to it. Finally, I close it. Is the above code the best way to access and close it? Alternatively, is the code below better?

import dataset    
from sqlalchemy.pool import NullPool

with dataset.connect(path_database, engine_kwargs={'poolclass': NullPool}) as db:
    table_f1 = db['name_table']
    # Do operations on table_f1

    db.commit()

In particular, I want to make 100% sure that there is no connection to the Postgres database left open once this piece of code is done. Which is the better way to achieve that: option 1 or option 2?
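
One way to verify it yourself, assuming you can open a separate psql session against the same server, is to check pg_stat_activity after the Python code has finished ('your_database' is a placeholder for the actual database name):

-- run from a separate session once the Python code has completed
SELECT pid, usename, application_name, client_addr, state
FROM pg_stat_activity
WHERE datname = 'your_database';

If the only row returned is the session running this query (or no rows at all, when the check runs against a different database), then no connection from the script is left open.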


Get this bounty!!!

#StackBounty: #ruby-on-rails #postgresql #activerecord Commit a nested transaction

Bounty: 50

Let's say I have a method that provides access to an API client in the scope of a user, and the API client will automatically update the user's OAuth tokens when they expire.

class User < ActiveRecord::Base

  def api
    ApiClient.new access_token: oauth_access_token,
                  refresh_token: oauth_refresh_token,
                  on_oauth_refresh: -> (tokens) {
                    # This proc will be called by the API client when an 
                    # OAuth refresh occurs
                    update_attributes({
                      oauth_access_token: tokens[:access_token],
                      oauth_refresh_token: tokens[:refresh_token]
                     })
                   }
  end

end

If I consume this API within a Rails transaction and a token refresh occurs, and then an error occurs, I can't persist the new OAuth tokens (because the update in the proc above also runs as part of the transaction):

u = User.first

User.transaction { 
  local_info = Info.create!

  # My tokens are expired so the client automatically
  # refreshes them and calls the proc that updates them locally.
  external_info = u.api.get_external_info(local_info.id)

  # Now when I try to locally save the info returned by the API an exception
  # occurs (for example due to validation). This rolls back the entire 
  # transaction (including the update of the user's new tokens.)
  local_info.info = external_info 
  local_info.save!
}

I'm simplifying the example, but basically the consumption of the API and the persistence of the data it returns need to happen within a transaction. How can I ensure the update to the user's tokens gets committed even if the parent transaction fails?
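
For background on why this is awkward at the database level: PostgreSQL's nested transactions (which is what Rails opens with transaction(requires_new: true)) are implemented as savepoints, and a released savepoint does not survive a rollback of the enclosing transaction. A minimal illustration in plain SQL, assuming a users table with an oauth_access_token column as in the model above:

BEGIN;
SAVEPOINT token_refresh;
UPDATE users SET oauth_access_token = 'new-token' WHERE id = 1;
RELEASE SAVEPOINT token_refresh;  -- only "commits" relative to the outer transaction
ROLLBACK;                         -- the token update above is discarded as well

So a write that must survive the outer rollback has to happen as its own top-level transaction, i.e. on a separate connection.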


Get this bounty!!!

#StackBounty: #postgresql #vacuum #postgresql-8.4 delete hung soon after vacuum verbose analyse

Bounty: 50

I have a fairly large Java webapp system that connects to a Postgres 8.4 database on RHEL.
This app only does inserts and reads.

This has been running without issue for 2.5 years and for the last year or so I implemented a data deletion script which deletes up to 5000 parent records and the corresponding child data in 3 tables.

This deletion script is called from a cron job every 5 minutes. It has been working flawlessly for over a year. This script always takes about 1-3 seconds to complete.

Also called from cron is a VACUUM VERBOSE ANALYZE script, which runs just once a day (a few minutes before a deletion). This also has been working flawlessly for over a year. It takes about 15 minutes to complete.

Now last weekend (Saturday), the vacuum kicked off at 14:03, and the deletion kicked off at 14:07 and did not complete. The next deletion kicked off at 14:12 and encountered a deadlock. The webapp hung at 14:14. At 14:16 everything resumed as normal and has been running fine since.

Now

To add more confusion, there is an almost identical setup on a standby server; however, its vacuum cron is due to run at 02:03 in the morning. When the vacuum kicked off on Sunday at 02:03, the same situation as above was encountered, and at 02:14 the Java app hung, resuming at 02:15.

More

To further confuse me:
every time I go to the site I take a reading of df; this is always around 40%, but after this happened, df now reports 5% less.

Any ideas? Please let me know if I have left out any relevant information.

edit

This is the Postgresql-Sat.log

WARNING:  skipping "pg_authid" --- only superuser can vacuum it
WARNING:  skipping "pg_database" --- only superuser can vacuum it
WARNING:  skipping "pg_tablespace" --- only superuser can vacuum it
WARNING:  skipping "pg_pltemplate" --- only superuser can vacuum it
WARNING:  skipping "pg_shdepend" --- only superuser can vacuum it
WARNING:  skipping "pg_shdescription" --- only superuser can vacuum it
WARNING:  skipping "pg_auth_members" --- only superuser can vacuum it
ERROR:  deadlock detected
DETAIL:  Process 12729 waits for ShareLock on transaction 91311423; blocked by process 12800.
    Process 12800 waits for ShareLock on transaction 91311422; blocked by process 12729.
    Process 12729: delete from child1 where id in
    (
    select id from parent where date_collected < now() - interval '13 months' order by id limit 5000
    );
    Process 12800: delete from child1 where id in
    (
    select id from parent where date_collected < now() - interval '13 months' order by id limit 5000
    );
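
This does not explain the weekend incident by itself, but the deadlock above is two deletion runs overlapping (the 14:07 run that never finished and the 14:12 run). One way to rule that out is to guard the script with an advisory lock; pg_try_advisory_lock is available in 8.4, and the key 42 below is an arbitrary placeholder:

-- at the start of the deletion script: skip this run if the previous one is still going
SELECT pg_try_advisory_lock(42) AS got_lock;  -- returns false if another session holds the lock

-- ... the existing DELETE statements, run only if got_lock is true ...

-- at the end of the script (the lock is also released automatically when the session exits)
SELECT pg_advisory_unlock(42);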


Get this bounty!!!

#StackBounty: #database #cloud-service #synchronization #postgresql Sync local postgres db to a cloud postgres db, as a Windows Service

Bounty: 100

I'm looking for software that will sync a local Postgres db to a db in the cloud, like what Tableau can do with this service.

This software has to be installed as a Windows service. It can be installed on any Windows machine.

The data will be pulled from the local db and refreshed into a Postgres db hosted in the cloud.

The direction is always local to remote, and the refresh period is on the order of a few minutes.

The service and the cloud host can be the same provider; in fact, we are looking for a solution that is part of the same ecosystem.

Thanks


Get this bounty!!!

#StackBounty: #csv #postgresql #r #sql How to select on CSV files like SQL in R?

Bounty: 50

I know the thread How can I inner join two csv files in R, which relies on merge, which I do not want.
I have two CSV data files, and I am wondering how to query them like SQL with R.
I really like PostgreSQL, so I think either it or an R tool with similar syntax would work great here.
There are two CSV files, where the primary key is data_id.

data.csv, where it is OK to have IDs not found in log.csv (e.g. 4)

data_id, event_value
1, 777
1, 666
2, 111
4, 123 
3, 324
1, 245

log.csv, where there are no duplicates in the ID column but duplicates can occur in name

data_id, name
1, leo
2, leopold
3, lorem

Pseudocode, described in PostgreSQL terms

  1. Let data_id=1
  2. Show name and event_value from data.csv and log.csv, respectively

Pseudocode as a partial PostgreSQL SELECT

SELECT name, event_value
    FROM data
    JOIN log USING (data_id)
    WHERE data_id = 1;

Expected output

leo, 777
leo, 666 
leo, 245

R approach

file1 <- read.csv("data.csv", col.names=c("data_id", "event_value"))
file2 <- read.csv("log.csv", col.names=c("data_id", "name"))

# TODO here something like the SQL query 
# http://stackoverflow.com/a/1307824/54964

Possible approaches; I think sqldf may be sufficient here:

  1. sqldf
  2. data.table
  3. dplyr
  4. PostgreSQL database

PostgreSQL thoughts

Schema

DROP TABLE IF EXISTS data, log;
CREATE TABLE data (
        -- data_id repeats in data.csv, so it cannot be the primary key here
        data_id INTEGER NOT NULL,
        event_value INTEGER NOT NULL
);
CREATE TABLE log (
        data_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
);
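
With that schema in place, the pure-PostgreSQL route is just a load plus a join; a sketch using psql's \copy (file paths are placeholders, and note that the sample rows have a space after each comma, so text fields may arrive with a leading space):

\copy data FROM 'data.csv' CSV HEADER
\copy log FROM 'log.csv' CSV HEADER

SELECT log.name, data.event_value
FROM data
JOIN log USING (data_id)
WHERE data_id = 1;
-- expected (in some order): leo | 777, leo | 666, leo | 245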

R: 3.3.3
OS: Debian 8.7


Get this bounty!!!

#StackBounty: #postgresql #indexing #database-performance #postgresql-performance #postgresql-9.6 Slow nested loop left join with index…

Bounty: 100

I am really struggling to optimize my query…

So here is the query:

SELECT wins / (wins + COUNT(loosers.match_id) + 0.) winrate,
       wins + COUNT(loosers.match_id) matches,
       winners.winning_champion_one_id,
       winners.winning_champion_two_id,
       winners.winning_champion_three_id,
       winners.winning_champion_four_id,
       winners.winning_champion_five_id
FROM
(
   SELECT COUNT(match_id) wins, winning_champion_one_id, winning_champion_two_id, winning_champion_three_id, winning_champion_four_id, winning_champion_five_id FROM matches
   WHERE
      157 IN (winning_champion_one_id, winning_champion_two_id, winning_champion_three_id, winning_champion_four_id, winning_champion_five_id)
   GROUP BY winning_champion_one_id, winning_champion_two_id, winning_champion_three_id, winning_champion_four_id, winning_champion_five_id
) winners
LEFT OUTER JOIN matches loosers ON
  winners.winning_champion_one_id = loosers.loosing_champion_one_id AND
  winners.winning_champion_two_id = loosers.loosing_champion_two_id AND
  winners.winning_champion_three_id = loosers.loosing_champion_three_id AND
  winners.winning_champion_four_id = loosers.loosing_champion_four_id AND
  winners.winning_champion_five_id = loosers.loosing_champion_five_id
GROUP BY winners.wins, winners.winning_champion_one_id, winners.winning_champion_two_id, winners.winning_champion_three_id, winners.winning_champion_four_id, winners.winning_champion_five_id
HAVING wins + COUNT(loosers.match_id) >= 20
ORDER BY winrate DESC, matches DESC
LIMIT 1;

And this is the output of EXPLAIN (BUFFERS, ANALYZE):

Limit  (cost=72808.80..72808.80 rows=1 width=58) (actual time=1478.749..1478.749 rows=1 loops=1)
  Buffers: shared hit=457002
  ->  Sort  (cost=72808.80..72837.64 rows=11535 width=58) (actual time=1478.747..1478.747 rows=1 loops=1)
"        Sort Key: ((((count(matches.match_id)))::numeric / ((((count(matches.match_id)) + count(loosers.match_id)))::numeric + '0'::numeric))) DESC, (((count(matches.match_id)) + count(loosers.match_id))) DESC"
        Sort Method: top-N heapsort  Memory: 25kB
        Buffers: shared hit=457002
        ->  HashAggregate  (cost=72462.75..72751.12 rows=11535 width=58) (actual time=1448.941..1478.643 rows=83 loops=1)
"              Group Key: (count(matches.match_id)), matches.winning_champion_one_id, matches.winning_champion_two_id, matches.winning_champion_three_id, matches.winning_champion_four_id, matches.winning_champion_five_id"
              Filter: (((count(matches.match_id)) + count(loosers.match_id)) >= 20)
              Rows Removed by Filter: 129131
              Buffers: shared hit=457002
              ->  Nested Loop Left Join  (cost=9857.76..69867.33 rows=115352 width=26) (actual time=288.086..1309.687 rows=146610 loops=1)
                    Buffers: shared hit=457002
                    ->  HashAggregate  (cost=9857.33..11010.85 rows=115352 width=18) (actual time=288.056..408.317 rows=129214 loops=1)
"                          Group Key: matches.winning_champion_one_id, matches.winning_champion_two_id, matches.winning_champion_three_id, matches.winning_champion_four_id, matches.winning_champion_five_id"
                          Buffers: shared hit=22174
                          ->  Bitmap Heap Scan on matches  (cost=1533.34..7455.69 rows=160109 width=18) (actual time=26.618..132.844 rows=161094 loops=1)
                                Recheck Cond: ((157 = winning_champion_one_id) OR (157 = winning_champion_two_id) OR (157 = winning_champion_three_id) OR (157 = winning_champion_four_id) OR (157 = winning_champion_five_id))
                                Heap Blocks: exact=21594
                                Buffers: shared hit=22174
                                ->  BitmapOr  (cost=1533.34..1533.34 rows=164260 width=0) (actual time=22.190..22.190 rows=0 loops=1)
                                      Buffers: shared hit=580
                                      ->  Bitmap Index Scan on matches_winning_champion_one_id_index  (cost=0.00..35.03 rows=4267 width=0) (actual time=0.045..0.045 rows=117 loops=1)
                                            Index Cond: (157 = winning_champion_one_id)
                                            Buffers: shared hit=3
                                      ->  Bitmap Index Scan on matches_winning_champion_two_id_index  (cost=0.00..47.22 rows=5772 width=0) (actual time=0.665..0.665 rows=3010 loops=1)
                                            Index Cond: (157 = winning_champion_two_id)
                                            Buffers: shared hit=13
                                      ->  Bitmap Index Scan on matches_winning_champion_three_id_index  (cost=0.00..185.53 rows=22840 width=0) (actual time=3.824..3.824 rows=23893 loops=1)
                                            Index Cond: (157 = winning_champion_three_id)
                                            Buffers: shared hit=89
                                      ->  Bitmap Index Scan on matches_winning_champion_four_id_index  (cost=0.00..537.26 rows=66257 width=0) (actual time=8.069..8.069 rows=67255 loops=1)
                                            Index Cond: (157 = winning_champion_four_id)
                                            Buffers: shared hit=244
                                      ->  Bitmap Index Scan on matches_winning_champion_five_id_index  (cost=0.00..528.17 rows=65125 width=0) (actual time=9.577..9.577 rows=67202 loops=1)
                                            Index Cond: (157 = winning_champion_five_id)
                                            Buffers: shared hit=231
                    ->  Index Scan using matches_loosing_champion_ids_index on matches loosers  (cost=0.43..0.49 rows=1 width=18) (actual time=0.006..0.006 rows=0 loops=129214)
                          Index Cond: ((matches.winning_champion_one_id = loosing_champion_one_id) AND (matches.winning_champion_two_id = loosing_champion_two_id) AND (matches.winning_champion_three_id = loosing_champion_three_id) AND (matches.winning_champion_four_id = loosing_champion_four_id) AND (matches.winning_champion_five_id = loosing_champion_five_id))
                          Buffers: shared hit=434828
Planning time: 0.584 ms
Execution time: 1479.779 ms

This is the DDL:

create table matches
(
    match_id bigint not null,
    winning_champion_one_id smallint,
    winning_champion_two_id smallint,
    winning_champion_three_id smallint,
    winning_champion_four_id smallint,
    winning_champion_five_id smallint,
    loosing_champion_one_id smallint,
    loosing_champion_two_id smallint,
    loosing_champion_three_id smallint,
    loosing_champion_four_id smallint,
    loosing_champion_five_id smallint,
    constraint matches_match_id_pk
        primary key (match_id)
)
;

create index matches_winning_champion_one_id_index
    on matches (winning_champion_one_id)
;

create index matches_winning_champion_two_id_index
    on matches (winning_champion_two_id)
;

create index matches_winning_champion_three_id_index
    on matches (winning_champion_three_id)
;

create index matches_winning_champion_four_id_index
    on matches (winning_champion_four_id)
;

create index matches_winning_champion_five_id_index
    on matches (winning_champion_five_id)
;

create index matches_loosing_champion_ids_index
    on matches (loosing_champion_one_id, loosing_champion_two_id, loosing_champion_three_id, loosing_champion_four_id, loosing_champion_five_id)
;

create index matches_loosing_champion_one_id_index
    on matches (loosing_champion_one_id)
;

create index matches_loosing_champion_two_id_index
    on matches (loosing_champion_two_id)
;

create index matches_loosing_champion_three_id_index
    on matches (loosing_champion_three_id)
;

create index matches_loosing_champion_four_id_index
    on matches (loosing_champion_four_id)
;

create index matches_loosing_champion_five_id_index
    on matches (loosing_champion_five_id)
;

There is probably something I am overlooking. Any help is appreciated.

The table can have 100m+ rows. At the moment it does have about 20m rows.

These are the only changes I made to postgresql.conf:

max_connections = 50
shared_buffers = 6GB
effective_cache_size = 18GB
work_mem = 125829kB
maintenance_work_mem = 1536MB
min_wal_size = 1GB
max_wal_size = 2GB
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100
max_parallel_workers_per_gather = 8
min_parallel_relation_size = 1

Here are the current table/index sizes for matches:

public.matches,2331648 rows
public.matches,197 MB
public.matches_riot_match_id_pk,153 MB
public.matches_loosing_champion_ids_index,136 MB
public.matches_loosing_champion_four_id_index,113 MB
public.matches_loosing_champion_five_id_index,113 MB
public.matches_winning_champion_one_id_index,113 MB
public.matches_winning_champion_five_id_index,113 MB
public.matches_winning_champion_three_id_index,112 MB
public.matches_loosing_champion_three_id_index,112 MB
public.matches_winning_champion_four_id_index,112 MB
public.matches_loosing_champion_one_id_index,112 MB
public.matches_winning_champion_two_id_index,112 MB
public.matches_loosing_champion_two_id_index,112 MB
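
Not a definitive fix, but one rewrite that may be worth benchmarking: pre-aggregate the losing side on the same five columns (it can be filtered on champion 157 as well, since only combinations containing 157 can ever match a winner group), so the planner can join two aggregated sets instead of probing matches_loosing_champion_ids_index once per winner group:

SELECT w.wins / (w.wins + COALESCE(l.losses, 0) + 0.) AS winrate,
       w.wins + COALESCE(l.losses, 0) AS matches,
       w.winning_champion_one_id, w.winning_champion_two_id, w.winning_champion_three_id,
       w.winning_champion_four_id, w.winning_champion_five_id
FROM (
   SELECT COUNT(*) AS wins,
          winning_champion_one_id, winning_champion_two_id, winning_champion_three_id,
          winning_champion_four_id, winning_champion_five_id
   FROM matches
   WHERE 157 IN (winning_champion_one_id, winning_champion_two_id, winning_champion_three_id,
                 winning_champion_four_id, winning_champion_five_id)
   GROUP BY winning_champion_one_id, winning_champion_two_id, winning_champion_three_id,
            winning_champion_four_id, winning_champion_five_id
) w
LEFT JOIN (
   SELECT COUNT(*) AS losses,
          loosing_champion_one_id, loosing_champion_two_id, loosing_champion_three_id,
          loosing_champion_four_id, loosing_champion_five_id
   FROM matches
   WHERE 157 IN (loosing_champion_one_id, loosing_champion_two_id, loosing_champion_three_id,
                 loosing_champion_four_id, loosing_champion_five_id)
   GROUP BY loosing_champion_one_id, loosing_champion_two_id, loosing_champion_three_id,
            loosing_champion_four_id, loosing_champion_five_id
) l ON  l.loosing_champion_one_id   = w.winning_champion_one_id
    AND l.loosing_champion_two_id   = w.winning_champion_two_id
    AND l.loosing_champion_three_id = w.winning_champion_three_id
    AND l.loosing_champion_four_id  = w.winning_champion_four_id
    AND l.loosing_champion_five_id  = w.winning_champion_five_id
WHERE w.wins + COALESCE(l.losses, 0) >= 20
ORDER BY winrate DESC, matches DESC
LIMIT 1;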


Get this bounty!!!

#StackBounty: #postgresql #pgadmin-4 How to set connection timeout for pgAdmin 4 (postgres 9.6)

Bounty: 100

I have read the article listed here:

How to set connection timeout value for pgAdmin?

many times, but I still have no idea where one sets the connection_timeout config parameter. I am connecting from localhost to localhost, so there should be no real problems with keep-alives.

I am using pgAdmin 4 and PostgreSQL 9.6 on Windows 10.

I would like to know which of the following applies:

  • Is this something that has to be set on the server (e.g. in postgresql.conf), and what is the value to set?
  • Or is this something in the client (and where is the client config, and what has to be set)?
  • Or is this a bug in pgAdmin?
  • Or something else?

I see others are also confused:

Is there a timeout option for remote access to PostgreSQL database?
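
For what it is worth, the server-side settings that usually come up around dropped or idle connections can be inspected from any SQL session; these are plain PostgreSQL settings, not pgAdmin ones, so this only covers the server side of the question:

-- 0 for the tcp_keepalives_* settings means "use the operating system default"
SHOW tcp_keepalives_idle;
SHOW tcp_keepalives_interval;
SHOW tcp_keepalives_count;
SHOW statement_timeout;
SHOW idle_in_transaction_session_timeout;  -- new in 9.6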


Get this bounty!!!