## #StackBounty: #matlab #random Transforming draws in Matlab from Gaussian mixture to uniform

### Bounty: 100

Consider the following draws for a `2x1` vector in Matlab with a probability distribution that is a mixture of two Gaussian components.

``````
P=10^3; %number draws
v=0.038462;

%First component
mu_a = [0,0.2806];
sigma_a = [v,0;0,v];

%Second component
mu_b = [0,-1.6806];
sigma_b = [v,0;0,v];

%Combine
MU = [mu_a;mu_b];
SIGMA = cat(3,sigma_a,sigma_b);
w = ones(1,2)/2; %equal weight 0.5
obj = gmdistribution(MU,SIGMA,w);

%Draws
RV_temp = random(obj,P);%Px2

% Transform each component of RV_temp into a uniform in [0,1] by estimating the cdf.
RV1=ksdensity(RV_temp(:,1), RV_temp(:,1),'function', 'cdf');
RV2=ksdensity(RV_temp(:,2), RV_temp(:,2),'function', 'cdf');
``````

Now, if we check whether `RV1` and `RV2` are uniformly distributed on `[0,1]` by doing

``````
ecdf(RV1)
ecdf(RV2)
``````

we can see that `RV1` is uniformly distributed on `[0,1]` (the empirical cdf is close to the 45 degree line) while `RV2` is not.

Could you help me to understand why?
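One possible explanation, which the post itself does not confirm, is kernel bandwidth. The first coordinate has the same mean in both components, so its marginal is a single Gaussian. The second coordinate is bimodal, with modes about 1.96 apart and a per-component standard deviation of about 0.196. If the kernel estimator picks its bandwidth with a normal-reference rule applied to the pooled sample (an assumption about the default, not something stated above), the bandwidth is wider than each mode and the estimated cdf is oversmoothed. The sketch below reproduces that effect in plain Python with a hand-written Gaussian-kernel cdf; `kde_cdf`, `ks_to_uniform`, and both bandwidth choices are illustrative and are not Matlab's implementation.

```python
import math
import random

def kde_cdf(sample, h):
    """Gaussian-kernel CDF estimate, evaluated at every sample point."""
    n = len(sample)
    s = h * math.sqrt(2.0)
    return [sum(0.5 * (1.0 + math.erf((x - xi) / s)) for xi in sample) / n
            for x in sample]

def ks_to_uniform(u):
    """Largest gap between the empirical CDF of u and the 45-degree line."""
    u = sorted(u)
    n = len(u)
    return max(max((i + 1) / n - v, v - i / n) for i, v in enumerate(u))

def std(sample):
    m = sum(sample) / len(sample)
    return math.sqrt(sum((x - m) ** 2 for x in sample) / (len(sample) - 1))

rng = random.Random(0)
n = 500
sd = math.sqrt(0.038462)
# Second coordinate of the mixture: means 0.2806 and -1.6806, equal weights.
x2 = [rng.gauss(0.2806 if rng.random() < 0.5 else -1.6806, sd) for _ in range(n)]

h_pooled = 1.06 * std(x2) * n ** (-0.2)  # normal-reference rule on the pooled sample (assumed)
h_small = 0.05                           # hand-picked, well below the per-mode spread

ks_pooled = ks_to_uniform(kde_cdf(x2, h_pooled))
ks_small = ks_to_uniform(kde_cdf(x2, h_small))
print(ks_pooled, ks_small)
```

In this sketch the transformed values drift further from the 45 degree line with the pooled bandwidth than with the narrow one. If the same mechanism is at work in Matlab, passing a smaller bandwidth to the estimator for `RV2` would be the thing to try.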


## #StackBounty: #python #python-2.7 #random #steganography Python PRNG_Steganography lsb method

### Bounty: 50

This is my implementation of a PRNG_Steganography tool for Python. You can also find the code on GitHub.

Imports

``````
from PIL import Image
import numpy as np
import sys
import os
import getopt
import base64
import random
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.backends import default_backend

# Set location of directory we are working in to load/save files
__location__ = os.path.realpath(os.path.join(os.getcwd(), os.path.dirname(__file__)))
``````

Encryption and decryption of the text methods via the Cryptography module

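``````
def get_key(password):
    digest = hashes.Hash(hashes.SHA256(), backend=default_backend())
    digest.update(password)  # line missing in the source; needed for the key to depend on the password
    return base64.urlsafe_b64encode(digest.finalize())

# The two def lines and the Fernet setup below are missing in the source;
# they are reconstructed from the call sites encrypt_text(magic, text) and
# decrypt_text(magic, text).
def encrypt_text(password, token):
    f = Fernet(get_key(password))
    return f.encrypt(bytes(token))

def decrypt_text(password, token):
    f = Fernet(get_key(password))
    return f.decrypt(bytes(token))
``````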

Main encryption method

``````
def encrypt(filename, text, magic):
    # check whether the text is a file name
    if len(text.split('.')[1:]):
        # Body missing in the source; assumed to load the file contents.
        # The helper name read_files is an assumption.
        text = read_files(os.path.join(__location__, text))
    t = [int(x) for x in ''.join(text_ascii(encrypt_text(magic, text)))] + [0]*7  # endbit
    try:
        # Change format to png
        filename = change_image_form(filename)

        # Load the image (line missing in the source; assumed)
        d_old = load_image(filename)

        # Check if image can contain the data
        if d_old.size < len(t):
            print '[*] Image not big enough'
            sys.exit(0)

        # get new data and save to image
        d_new = encrypt_lsb(d_old, magic, t)
        save_image(d_new, 'new_'+filename)
    except Exception, e:
        print str(e)
``````

Main decryption method

``````
def decrypt(filename, magic):
    try:
        # Load the image (line missing in the source; assumed)
        data = load_image(filename)

        # Retrieve text
        text = decrypt_lsb(data, magic)
        print '[*] Retrieved text: \n%s' % decrypt_text(magic, text)
    except Exception, e:
        print str(e)
``````

Random methods

``````
def text_ascii(text):
    return map(lambda char: '{:07b}'.format(ord(char)), text)

def ascii_text(byte_char):
    return chr(int(byte_char, 2))

def next_random(random_list, data):
    next_random_number = random.randint(0, data.size-1)
    while next_random_number in random_list:
        next_random_number = random.randint(0, data.size-1)
    return next_random_number

def generate_seed(magic):
    seed = 1
    for char in magic:
        seed *= ord(char)
    print '[*] Your magic number is %d' % seed
    return seed
``````

Encrypt via lsb_method

``````
def encrypt_lsb(data, magic, text):
    print '[*] Starting Encryption'

    # We must alter the seed but for now lets make it simple
    random.seed(generate_seed(magic))

    random_list = []
    for i in range(len(text)):
        next_random_number = next_random(random_list, data)
        random_list.append(next_random_number)
        data.flat[next_random_number] = (data.flat[next_random_number] & ~1) | text[i]

    print '[*] Finished Encryption'
    return data
``````
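The bit manipulation in `encrypt_lsb` can be seen in isolation with a small sketch. It uses a plain list of made-up pixel values instead of image data and writes the bits into consecutive positions rather than random ones; `embed_bits` and `extract_bits` are illustrative names, not part of the tool, and the snippet is written for Python 3.

```python
def embed_bits(values, bits):
    """Return a copy of values with each bit written into a least significant bit."""
    out = list(values)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        out[i] = (out[i] & ~1) | bit
    return out

def extract_bits(values, count):
    """Read back the lowest bit of the first `count` values."""
    return [v & 1 for v in values[:count]]

pixels = [200, 17, 64, 255, 128, 3, 90]              # made-up channel values
bits = [int(b) for b in '{:07b}'.format(ord('A'))]   # 7-bit code of 'A'
stego = embed_bits(pixels, bits)
recovered = extract_bits(stego, 7)
char = chr(int(''.join(str(b) for b in recovered), 2))
print(stego, char)
```

Each value changes by at most 1, which is why the altered image looks the same to the eye.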

Decrypt via lsb_method

``````
def decrypt_lsb(data, magic):
    print '[*] Starting Decryption'
    random.seed(generate_seed(magic))

    random_list = []
    output = temp_char = ''

    for i in range(data.size):
        next_random_number = next_random(random_list, data)
        random_list.append(next_random_number)
        temp_char += str(data.flat[next_random_number] & 1)
        if len(temp_char) == 7:
            if int(temp_char) > 0:
                output += ascii_text(temp_char)
                temp_char = ''
            else:
                print '[*] Finished Decryption'
                return output
``````

Image handling methods

``````
def load_image(filename):
    img = Image.open(os.path.join(__location__, filename))
    data = np.asarray(img, dtype="int32")
    return data

def save_image(npdata, outfilename):
    img = Image.fromarray(np.asarray(np.clip(npdata, 0, 255), dtype="uint8"), "RGB")
    img.save(os.path.join(__location__, outfilename))

def change_image_form(filename):
    f = filename.split('.')
    if not (f[-1] == 'bmp' or f[-1] == 'BMP' or f[-1] == 'PNG' or f[-1] == 'png'):
        img = Image.open(os.path.join(__location__, filename))
        filename = ''.join(f[:-1]) + '.png'
        img.save(os.path.join(__location__, filename))
    return filename

# The def line of this helper is missing in the source; the name read_files is assumed.
def read_files(filename):
    if os.path.exists(filename):
        with open(filename, 'r') as f:
            return ''.join([i for i in f])
    return filename.replace(__location__, '')[1:]
``````

Usage and main

``````
def usage():
    print "Steganography prng-Tool @Ludisposed & @Qin"
    print ""
    print "Usage: prng_stego.py -e -m magic filename text "
    print "-e --encrypt              - encrypt filename with text"
    print "-d --decrypt              - decrypt filename"
    print "-m --magic                - encrypt/decrypt with password"
    print ""
    print ""
    print "Examples: "
    print "prng_stego.py -e -m pass test.png howareyou"
    print 'python prng_stego.py -e -m magic test.png tester.sh'
    print 'python prng_stego.py -e -m magic test.png file_test.txt'
    print 'prng_stego.py --encrypt --magic password test.png "howareyou  some other text"'
    print ''
    print "prng_stego.py -d -m password test.png"
    print "prng_stego.py --decrypt --magic password test.png"
    sys.exit(0)

if __name__ == "__main__":
    if not len(sys.argv[1:]):
        usage()
    try:
        opts, args = getopt.getopt(sys.argv[1:], "hedm:", ["help", "encrypt", "decrypt", "magic="])
    except getopt.GetoptError as err:
        print str(err)
        usage()

    magic = to_encrypt = None
    for o, a in opts:
        if o in ("-h", "--help"):
            usage()
        elif o in ("-e", "--encrypt"):
            to_encrypt = True
        elif o in ("-d", "--decrypt"):
            to_encrypt = False
        elif o in ("-m", "--magic"):
            magic = a
        else:
            assert False, "Unhandled Option"

    if magic is None or to_encrypt is None:
        usage()

    if not to_encrypt:
        filename = args[0]
        decrypt(filename, magic)
    else:
        filename = args[0]
        text = args[1]
        encrypt(filename, text, magic)
``````

Any general coding tips are welcome!

I would also love some tips on how to use the `magic` to create a weird seed.

It works by seeding the random module and getting the next unique random integer with that seed. At decryption, it knows to stop looking for bits when it finds the end marker `[0]*7`.

Furthermore, I’m interested in how random this is. I think this is harder to decrypt than plain lsb_stego, but I cannot prove anything.
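As a starting point for the seed question, here is a minimal sketch written for Python 3. It assumes the goal is a seed that depends on every character and on their order. `product_seed` restates the rule from `generate_seed`; `seed_from_magic` and `pick_positions` are hypothetical replacements, and they produce a different position sequence than the current tool, so images written with one scheme cannot be read with the other.

```python
import hashlib
import random

def product_seed(magic):
    """The seed rule from the post: product of the character codes."""
    seed = 1
    for char in magic:
        seed *= ord(char)
    return seed

def seed_from_magic(magic):
    """Hash the passphrase so that every byte and its position affect the seed."""
    digest = hashlib.sha256(magic.encode('utf-8')).digest()
    return int.from_bytes(digest, 'big')

def pick_positions(magic, size, count):
    """Draw `count` distinct indices in range(size), reproducibly from `magic`."""
    rng = random.Random(seed_from_magic(magic))   # private generator, no global state
    return rng.sample(range(size), count)         # sample() never repeats an index

positions = pick_positions('pass', 1000, 50)
print(product_seed('ab') == product_seed('ba'))   # the product ignores character order
```

Drawing all positions with `sample()` also avoids the repeated `in random_list` scan. Note that the `random` module is documented as unsuitable for security purposes, so a hashed seed makes the positions harder to guess by hand but does not by itself make the scheme cryptographically strong.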

Kind Regards: Ludisposed


## Best way to select random rows PostgreSQL

Suppose you have a very large table with 500 million rows, and you need to select 1000 random rows from it, fast.

Given the following specifications:

• You have a numeric ID column (integer numbers) with only few (or moderately few) gaps.
• Ideally no or few write operations.
• Your ID column is indexed. A primary key serves nicely.

The query below does not need a sequential scan of the big table, only an index scan.

First, get estimates for the main query:

``````
SELECT count(*) AS ct              -- optional
     , min(id)  AS min_id
     , max(id)  AS max_id
     , max(id) - min(id) AS id_span
FROM   big;
``````

The only possibly expensive part is the `count(*)` (for huge tables). If an estimate is good enough, one is available at almost no cost:

``SELECT reltuples AS ct FROM pg_class WHERE oid = 'schema_name.big'::regclass;``

As long as `ct` isn’t much smaller than `id_span`, the query will outperform most other approaches.

``````
WITH params AS (
   SELECT 1       AS min_id           -- minimum id <= current min id
        , 5100000 AS id_span          -- rounded up. (max_id - min_id + buffer)
   )
SELECT *
FROM  (
   SELECT p.min_id + trunc(random() * p.id_span)::integer AS id
   FROM   params p
         ,generate_series(1, 1100) g  -- 1000 + buffer
   GROUP  BY 1                        -- trim duplicates
   ) r
JOIN   big USING (id)
LIMIT  1000;                           -- trim surplus
``````

• Generate random numbers in the `id` space. You have “few gaps”, so add 10 % (enough to easily cover the blanks) to the number of rows to retrieve.
• Each `id` can be picked multiple times by chance (though very unlikely with a big id space), so group the generated numbers (or use `DISTINCT`).
• Join the `id`s to the big table. This should be very fast with the index in place.
• Finally trim surplus `id`s that have not been eaten by dupes and gaps. Every row has a completely equal chance to be picked.
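
The four steps above can be simulated outside the database. The following Python sketch is only an illustration of the logic: a set of integers stands in for the indexed `id` column, and the table size, gap rate, and the name `random_rows` are made up for the example.

```python
import random

def random_rows(table_ids, min_id, id_span, limit, buffer_factor=1.1, rng=random):
    """Mimic the query: draw ids, drop duplicates, keep ids that exist, trim."""
    draws = int(limit * buffer_factor)                        # 1000 + buffer
    candidates = {min_id + int(rng.random() * id_span)        # set = trim duplicates
                  for _ in range(draws)}
    hits = [i for i in candidates if i in table_ids]          # join drops the gaps
    return hits[:limit]                                       # trim surplus

rng = random.Random(42)
# Hypothetical table: ids 1..100000 with roughly 2 % of them missing.
table_ids = {i for i in range(1, 100001) if rng.random() >= 0.02}
picked = random_rows(table_ids, min_id=1, id_span=100000, limit=1000, rng=rng)
print(len(picked))
```

With few gaps, the 10 % buffer comfortably covers both the misses and the rare duplicate draws, so the final trim has a surplus to work with.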

### Short version

You can simplify this query. The CTE in the query above is just for educational purposes:

``````
SELECT *
FROM  (
   SELECT DISTINCT 1 + trunc(random() * 5100000)::integer AS id
   FROM   generate_series(1, 1100) g
   ) r
JOIN   big USING (id)
LIMIT  1000;
``````

## Refine with rCTE

Especially if you are not so sure about gaps and estimates.

``````
WITH RECURSIVE random_pick AS (
   SELECT *
   FROM  (
      SELECT 1 + trunc(random() * 5100000)::int AS id
      FROM   generate_series(1, 1030)  -- 1000 + few percent - adapt to your needs
      LIMIT  1030                      -- hint for query planner
      ) r
   JOIN   big b USING (id)             -- eliminate miss

   UNION                               -- eliminate dupe
   SELECT b.*
   FROM  (
      SELECT 1 + trunc(random() * 5100000)::int AS id
      FROM   random_pick r             -- plus 3 percent - adapt to your needs
      LIMIT  999                       -- less than 1000, hint for query planner
      ) r
   JOIN   big b USING (id)             -- eliminate miss
   )
SELECT *
FROM   random_pick
LIMIT  1000;  -- actual limit
``````

We can work with a smaller surplus in the base query. If there are so many gaps that we don’t find enough rows in the first iteration, the rCTE continues to iterate with the recursive term. We still need relatively few gaps in the ID space, or the recursion may run dry before the limit is reached. The alternative is to start with a large enough buffer, which defeats the purpose of optimizing performance.

Duplicates are eliminated by the `UNION` in the rCTE.

The outer `LIMIT` makes the CTE stop as soon as we have enough rows.

This query is carefully drafted to use the available index, generate actually random rows and not stop until we fulfill the limit (unless the recursion runs dry). There are a number of pitfalls here if you are going to rewrite it.
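
To make the control flow of the rCTE easier to follow, here is a rough Python simulation. It is a sketch under simplifying assumptions: a set of integers stands in for the table, the gap rate is exaggerated to force a second iteration, and `random_pick` only borrows its name from the CTE.

```python
import random

def random_pick(table_ids, id_span, limit, first_batch, rng):
    def draw(count):
        ids = {1 + int(rng.random() * id_span) for _ in range(count)}
        return {i for i in ids if i in table_ids}     # the JOIN eliminates misses

    result = draw(first_batch)                        # non-recursive term
    working = set(result)
    while working and len(result) < limit:
        # Recursive term: one draw per row found in the previous iteration,
        # capped at limit - 1. Subtracting result plays the role of UNION.
        new_rows = draw(min(len(working), limit - 1)) - result
        result |= new_rows
        working = new_rows                            # empty set: recursion runs dry
    return list(result)[:limit]                       # outer LIMIT

rng = random.Random(7)
# Hypothetical table with many gaps: about 30 % of ids 1..100000 are missing.
table_ids = {i for i in range(1, 100001) if rng.random() >= 0.3}
rows = random_pick(table_ids, id_span=100000, limit=1000, first_batch=1030, rng=rng)
print(len(rows))
```

The loop stops for one of two reasons, matching the text: enough rows were collected, or an iteration produced no new rows.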

## Wrap into function

For repeated use with varying parameters:

``````
CREATE OR REPLACE FUNCTION f_random_sample(_limit int = 1000, _gaps real = 1.03)
  RETURNS SETOF big AS
$func$
DECLARE
   _surplus  int := _limit * _gaps;
   _estimate int := (           -- get current estimate from system
      SELECT c.reltuples * _gaps
      FROM   pg_class c
      WHERE  c.oid = 'big'::regclass);
BEGIN

   RETURN QUERY
   WITH RECURSIVE random_pick AS (
      SELECT *
      FROM  (
         SELECT 1 + trunc(random() * _estimate)::int
         FROM   generate_series(1, _surplus) g
         LIMIT  _surplus           -- hint for query planner
         ) r (id)
      JOIN   big USING (id)        -- eliminate misses

      UNION                        -- eliminate dupes
      SELECT *
      FROM  (
         SELECT 1 + trunc(random() * _estimate)::int
         FROM   random_pick        -- just to make it recursive
         LIMIT  _limit             -- hint for query planner
         ) r (id)
      JOIN   big USING (id)        -- eliminate misses
      )
   SELECT *
   FROM   random_pick
   LIMIT  _limit;
END
$func$  LANGUAGE plpgsql VOLATILE ROWS 1000;
``````

Call:

``````
SELECT * FROM f_random_sample();
SELECT * FROM f_random_sample(500, 1.05);
``````

You could even make this generic to work for any table: take the name of the PK column and the table as polymorphic type and use `EXECUTE`. But that’s beyond the scope of this post.

### Possible alternative

If your requirements allow identical sets for repeated calls (and we are talking about repeated calls), I would consider a materialized view. Execute the above query once and write the result to a table. Users get a quasi-random selection at lightning speed. Refresh your random pick at intervals or events of your choosing.

## Postgres 9.5 introduces `TABLESAMPLE SYSTEM (n)`

It’s very fast, but the result is not exactly random. The manual:

The `SYSTEM` method is significantly faster than the `BERNOULLI` method when small sampling percentages are specified, but it may return a less-random sample of the table as a result of clustering effects.

And the number of rows returned can vary wildly. For our example, to get roughly 1000 rows, try:

``SELECT * FROM big TABLESAMPLE SYSTEM ((1000 * 100) / 5100000.0);``

Where n is a percentage. The manual:

The `BERNOULLI` and `SYSTEM` sampling methods each accept a single argument which is the fraction of the table to sample, expressed as a percentage between 0 and 100. This argument can be any `real`-valued expression.

Bold emphasis mine.
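
Because the argument is a percentage and not a row count, the value has to be derived from the table size. A tiny Python sketch of that arithmetic follows; `sample_percent` is a hypothetical helper, and the row count would normally come from the `reltuples` estimate shown earlier.

```python
def sample_percent(wanted_rows, estimated_rows):
    """Percentage of the table to request so that about wanted_rows come back."""
    return 100.0 * wanted_rows / estimated_rows

pct = sample_percent(1000, 5100000)
print("SELECT * FROM big TABLESAMPLE SYSTEM (%.6f);" % pct)
```

For 1000 rows out of 5.1 million this comes to just under 0.02 percent, the same value the expression `(1000 * 100) / 5100000.0` yields in the query above.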


## Simple way to generate a random password in PHP

When creating web apps, there’s often a need to generate a random password for your users. There are a number of ways to do this, but in needing to do it recently I came up with this very simple function that will generate a password (or other random string) of whatever length you wish. It’s particularly useful when generating passwords for users that they will then change in the future. It uses PHP’s handy `str_shuffle()` function:

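```
<?php
function random_password( $length = 8 ) {
    $chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!@#$%^&*()_-=+;:,.?";
    $index=rand($length,$length*$length);
    // The rest of the body is missing from this copy of the post. The two lines
    // below are an assumed completion based on str_shuffle(), as the text describes.
    $password = substr( str_shuffle( $chars ), 0, $length );
    return $password;
}
```

Usage:

`<?php $password = random_password(8); ?>`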