#StackBounty: #performance #beginner #php #file-system #api Equity data processing: Fast and/or efficient file writing using PHP

Bounty: 50

Problem

This is my first scripting project, and I’m sure it has many issues.

The main class, EQ, scrapes equities data using EquityRecords, calculates sector coefficients using
SectorMovers, estimates equity prices, and finally writes ~8,000 HTML strings (~100-120 KB) to .md files for viewing.
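
For reference, a minimal sketch of the file-writing step described above. It is not the EQ code itself: writePages(), the temporary directory, and the sample strings are hypothetical names and data chosen for illustration. It assumes each page is already built as a string, creates the directory once, and uses one file_put_contents() call per file.

```php
<?php
declare(strict_types=1);

/**
 * Write each string in $pages (filename => content) into $dir.
 * Hypothetical helper; returns the number of files written.
 */
function writePages(string $dir, array $pages): int
{
    // Create the directory once, outside the loop.
    if (!is_dir($dir) && !mkdir($dir, 0755, true) && !is_dir($dir)) {
        throw new RuntimeException("Cannot create directory: $dir");
    }
    $written = 0;
    foreach ($pages as $filename => $content) {
        // One call opens, writes, and closes the file; LOCK_EX takes an exclusive lock while writing.
        if (file_put_contents($dir . '/' . $filename, $content, LOCK_EX) !== false) {
            $written++;
        }
    }
    return $written;
}

// Sample data, for illustration only.
$dir = sys_get_temp_dir() . '/eq-sketch-' . uniqid();
$count = writePages($dir, [
    'aapl.md' => '<p>AAPL</p>',
    'ge.md'   => '<p>GE</p>',
]);
echo $count, PHP_EOL; // prints 2
```

Whether this is faster than separate fopen/fwrite/fclose calls on a given server would need to be measured.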

I couldn’t include the EQ code in this post because of the post character limit. Please see the entire code at this GitHub link.


Performance

The only goal is to make EQ as fast and efficient as possible on a single server. Would you kindly review it and help me reach this goal?
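
One way to see where the time goes before optimizing is to time each stage separately. The following is a minimal sketch, assuming PHP 7.3 or later for hrtime(). The timeStage() helper and the sample workload are hypothetical and not part of EQ.

```php
<?php
declare(strict_types=1);

/**
 * Run $stage and return [result, elapsed milliseconds].
 * Hypothetical helper; hrtime() requires PHP 7.3 or later.
 */
function timeStage(callable $stage): array
{
    $start = hrtime(true);                       // monotonic clock, in nanoseconds
    $result = $stage();
    $elapsedMs = (hrtime(true) - $start) / 1e6;  // nanoseconds to milliseconds
    return [$result, $elapsedMs];
}

// Stand-in workload; in practice this would be one stage such as fetching or writing.
[$sum, $ms] = timeStage(function () {
    return array_sum(range(1, 1000));
});
printf("sum=%d took %.3f ms\n", $sum, $ms);
```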


EquityRecords

date_default_timezone_set("UTC");
ini_set('max_execution_time', 0);
ini_set('memory_limit', '-1');
set_time_limit(0);

// EquityRecords::allEquitiesSignleJSON(new EquityRecords());

class EquityRecords
{

    const NUMBER_OF_STOCKS_PER_REQUEST = 100;
    const NEW_LINE = "\n";

    /**
     *
     * @var a string of iextrading symbols
     */

    const SYMBOLS_PATH = '/../../config/z-iextrading-symbs.md';

    /**
     *
     * @var a string of our symbols json directory
     */

    const SYMBOLS_DIR = "/../../blog-back/equities/real-time-60sec/z-raw-equilibrium-estimation";

    /**
     *
     * @var a string of target path and query
     */

    const TARGET_QUERY = "stock/market/batch?symbols=";

    /**
     *
     * @var a string of iextrading base URL
     */

    const BASE_URL = "https://api.iextrading.com/1.0/";

    /**
     *
     * @var a string of iextrading end point
     */

    const END_POINT = "&types=quote,chart&range=1m&last=10";

    /**
     *
     * @var an integer for maximum number of stocks per URL on each call
     */

    //***************** A ********************** //
    // public static function getSymbols() {
    //     return array_map(function($line){ return str_getcsv($line, "\t"); }, file(__DIR__ . self::SYMBOLS_PATH));
    // }

    public static function getSymbols()
    {

        //***************** START: ALL SYMBOLS ARRAY ********************** //
        // var: is a filename path directory, where there is an md file with list of equities
        $list_of_equities_file = __DIR__ . self::SYMBOLS_PATH;

        // var: is content of md file with list of equities
        $content_of_equities = file_get_contents($list_of_equities_file);

        // var is an array(3) of equities such as: string(4) "ZYNE", string(10) "2019-01-04", string(27) "ZYNERBA PHARMACEUTICALS INC"

        // $symbols_array=preg_split('/\r\n|\r|\n/', $content_of_equities);
        $symbols_array = preg_split('/\R/', $content_of_equities);
        //***************** END: ALL SYMBOLS ARRAY ********************** //

        // child and mother arrays are created to help calling equities in batches of 100, which seems to be the API limit.
        $child = array();
        $mother = array();
        // var: is 100 counter
        $limit_counter = self::NUMBER_OF_STOCKS_PER_REQUEST;
        foreach ($symbols_array as $ticker_arr) {
            $limit_counter = $limit_counter - 1;
            // Use a separate variable so the array being iterated is not shadowed
            $ticker_fields = preg_split('/\t/', $ticker_arr);
            array_push($child, $ticker_fields);

            if ($limit_counter <= 0) {
                $limit_counter = self::NUMBER_OF_STOCKS_PER_REQUEST;
                array_push($mother, $child);
                $child = array();
            }

        }
        // Keep the final partial batch, which the loop above never pushes
        if (!empty($child)) {
            array_push($mother, $child);
        }
        return $mother;
    }

    public static function allEquitiesSignleJSON()
    {
        $equity_arrays = EquityRecords::getSymbols();
        $base_url = self::BASE_URL . self::TARGET_QUERY;

        $current_time = date("Y-m-d-H-i-s");
        $all_equities = array();
        // ticker: AAPL, GE, AMD
        foreach ($equity_arrays as $ticker_arr) {
            $ticker = array_column($ticker_arr, 0);
            $equity_url = $base_url . implode("%2C", $ticker) . self::END_POINT;
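            $raw_equity_json = file_get_contents($equity_url);
            $raw_equity_array = ($raw_equity_json === false) ? null : json_decode($raw_equity_json, true);
            // Skip batches whose request failed or whose response is not valid JSON
            if (!is_array($raw_equity_array)) {
                continue;
            }
            $all_equities = array_merge($all_equities, $raw_equity_array);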
        }

        $all_equities_json = json_encode($all_equities);

        $symbols_dir = __DIR__ . self::SYMBOLS_DIR;

        if (!is_dir($symbols_dir)) {mkdir($symbols_dir, 0755, true);}

        $raw_equity_file = $symbols_dir . "/" . $current_time . ".json";
        $fp = fopen($raw_equity_file, "x+");
        fwrite($fp, $all_equities_json);
        fclose($fp);
        echo "YAAAY! Equity JSON file success at " . __METHOD__ . " ! 💚 " . self::NEW_LINE;
    }

    /**
     * @return a string for var_dump
     */
    public static function p()
    {
        $args = func_get_args();
        $die = (end($args) === 1) && array_pop($args);
        echo self::NEW_LINE;
        foreach ($args as $v) {
            $output = print_r($v, true);
            var_dump($output);
            echo self::NEW_LINE;
        }
        echo self::NEW_LINE;
        if ($die) {
            die();
        }

    }

}
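
The batching done in getSymbols() can also be sketched with array_chunk(), which keeps the final partial batch without a manual counter. This is an alternative sketch, not the code above: batchSymbolRows() is a hypothetical helper, and the sample rows only imitate the tab-separated format described in the comments.

```php
<?php
declare(strict_types=1);

/**
 * Split tab-separated symbol lines into batches of at most $batchSize rows.
 * Hypothetical helper, not part of the original EquityRecords class.
 */
function batchSymbolRows(string $content, int $batchSize): array
{
    // \R matches any line-break sequence (\r\n, \r, or \n).
    $lines = preg_split('/\R/', trim($content));
    $rows = [];
    foreach ($lines as $line) {
        if ($line === '') {
            continue; // skip blank lines
        }
        $rows[] = explode("\t", $line);
    }
    // array_chunk() returns the last batch even when it has fewer than $batchSize rows.
    return array_chunk($rows, $batchSize);
}

// Sample data, for illustration only.
$sample = "AAPL\t2019-01-04\tAPPLE INC\nGE\t2019-01-04\tGENERAL ELECTRIC CO\r\nAMD\t2019-01-04\tADVANCED MICRO DEVICES INC\n";
$batches = batchSymbolRows($sample, 2);
echo count($batches), PHP_EOL;                            // prints 2
echo implode(',', array_column($batches[0], 0)), PHP_EOL; // prints AAPL,GE
```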

SectorMovers

date_default_timezone_set("UTC");
ini_set('max_execution_time', 0);
ini_set('memory_limit', '-1');
set_time_limit(0);

require_once __DIR__ . "/EquityRecords.php";

SectorMovers::getSectors();

class SectorMovers
{

    /**
     *
     * @var a string of iextrading base URL
     */

    const BASE_URL = "https://api.iextrading.com/1.0/";

    /**
     *
     * @var a string of target path and query
     */

    const TARGET_QUERY = "stock/market/batch?symbols=";

    /**
     *
     * @var a string for backend path for every sector
     */

    const EACH_SECTOR_DIR_PREFIX = "/../../blog-back/sectors/real-time-60sec/z-raw-sector-";

    /**
     *
     * @var a string for backend path for index sector
     */

    const INDEX_SECTOR_DIR_PREFIX = "/../../blog-back/sectors/real-time-60sec/y-index/";

    /**
     *
     * @var a string for live data path
     */

    const LIVE_DATA_DIR = "/../../../public_html/blog/files/";
    const DIR_FRONT_SECTOR_COEF_FILENAME = "s-1.txt"; // Filename that records sector coefficient JSON

    public static function getSectors()
    {
        $base_url = self::BASE_URL . self::TARGET_QUERY;
        $current_time = date("Y-m-d-H-i-s");

        $permission = 0755;

        $index_data = array("Overall" => array("sector_weight" => 1, "sector_coefficient" => 1, "sector_value" => 0));
        $sector_movers = SectorMovers::iexSectorParams();
        foreach ($sector_movers as $sector_mover) {
            // $sector_url = $base_url . implode(",", array_keys($sector_mover["selected_tickers"])) . "&types=quote&range=1m";
            $sector_url = $base_url . implode("%2C", array_keys($sector_mover["selected_tickers"])) . "&types=quote&range=1m";
            $rawSectorJson = file_get_contents($sector_url);
            $raw_sector_array = json_decode($rawSectorJson, true);

            // ******************* Back Data ***************** //
            // Write the raw file in the back directories

            // $rawSectorDir =  __DIR__ . self::EACH_SECTOR_DIR_PREFIX . $sector_mover["directory"];

            // // if back directory not exist
            // if (!is_dir($rawSectorDir)) {mkdir($rawSectorDir, $permission, true);}

            // // create and open/write/close sector data to back directories
            // $rawSectorFile = $rawSectorDir . "/" . $current_time . ".json";
            // $fp = fopen($rawSectorFile, "a+");
            // fwrite($fp, $rawSectorJson);
            // fclose($fp);

            // ******************* End Back Data ***************** //

            // Calculate the real-time index
            $index_value = 0;
            foreach ($raw_sector_array as $ticker => $ticker_stats) {
                if (isset($sector_mover["selected_tickers"][$ticker], $ticker_stats["quote"], $ticker_stats["quote"]["extendedChangePercent"], $ticker_stats["quote"]["changePercent"], $ticker_stats["quote"]["ytdChange"])) {

                    $change_amount = ($ticker_stats["quote"]["extendedChangePercent"] + $ticker_stats["quote"]["changePercent"] + $ticker_stats["quote"]["ytdChange"]) / 200;
                    $index_value += $sector_mover["sector_weight"] * $sector_mover["selected_tickers"][$ticker] * $change_amount;
                }
            }

            $index_data[$sector_mover["sector"]] = array("sector_weight" => $sector_mover["sector_weight"], "sector_coefficient" => $sector_mover["sector_coefficient"], "sector_value" => $index_value);
            $index_data["Overall"]["sector_value"] += $index_data[$sector_mover["sector"]]["sector_value"];
        }

        // Calculate the index factor for better visibility between -1 and +1
        $front_index_data = array();
        foreach ($index_data as $sector_name => $sector_index_data) {

            // $index_sign = $sector_index_data["sector_value"];
            // if ($index_sign < 0) {
            //     $index_sign = - $index_sign;
            // }

            $index_sign = abs($sector_index_data["sector_value"]);

            $index_factor = 1;
            for ($i = 0; $i <= 10; $i++) {
                $index_factor = pow(10, $i);
                if (($index_factor * $index_sign) > 1) {
                    $index_factor = pow(10, $i - 1);
                    break;
                }
            }

            // $index_factor = 10 ** strlen(preg_match('~\.\K0+~', $float, $zeros) ? $zeros[0] : 0);

            $front_index_data[$sector_name] = $sector_index_data["sector_weight"] * $sector_index_data["sector_coefficient"] * $sector_index_data["sector_value"] * $index_factor;
        }

        // Write the index file
        $index_sector_dir = __DIR__ . self::INDEX_SECTOR_DIR_PREFIX;

        if (!is_dir($index_sector_dir)) {mkdir($index_sector_dir, $permission, true);}

        $index_sector_file = $index_sector_dir . $current_time . ".json";

        $index_sector_json = json_encode($front_index_data, JSON_FORCE_OBJECT);
        $fp = fopen($index_sector_file, "a+");
        fwrite($fp, $index_sector_json);
        fclose($fp);

        $sector_dir = __DIR__ . self::LIVE_DATA_DIR;

        if (!is_dir($sector_dir)) {mkdir($sector_dir, $permission, true);} // if data directory did not exist

        // if s-1 file did not exist
        if (!file_exists($sector_dir . self::DIR_FRONT_SECTOR_COEF_FILENAME)) {
            $handle = fopen($sector_dir . self::DIR_FRONT_SECTOR_COEF_FILENAME, "wb");
            fwrite($handle, "d");
            fclose($handle);
        }

        $sector_coef_file = $sector_dir . self::DIR_FRONT_SECTOR_COEF_FILENAME;
        copy($index_sector_file, $sector_coef_file);
        echo "YAAAY! " . __METHOD__ . " updated sector coefficients successfully 💚!\n";

        return $front_index_data;
    }

    public static function iexSectorParams()
    {
        $sector_movers = array(
            array(
                "sector" => "IT",
                "directory" => "information-technology",
                "sector_weight" => 0.18,
                "sector_coefficient" => 4,
                "selected_tickers" => array(
                    "AAPL" => 0.18,
                    "AMZN" => 0.16,
                    "GOOGL" => 0.14,
                    "IBM" => 0.2,
                    "MSFT" => 0.1,
                    "FB" => 0.1,
                    "NFLX" => 0.08,
                    "ADBE" => 0.06,
                    "CRM" => 0.04,
                    "NVDA" => 0.02,
                ),
            ),
            array(
                "sector" => "Telecommunication",
                "directory" => "telecommunication-services",
                "sector_weight" => 0.12,
                "sector_coefficient" => 4,
                "selected_tickers" => array(
                    "VZ" => 0.18,
                    "CSCO" => 0.16,
                    "CMCSA" => 0.14,
                    "T" => 0.12,
                    "CTL" => 0.1,
                    "CHTR" => 0.1,
                    "S" => 0.08,
                    "DISH" => 0.06,
                    "USM" => 0.04,
                    "VOD" => 0.02,
                ),
            ),
            array(
                "sector" => "Finance",
                "directory" => "financial-services",
                "sector_weight" => 0.1,
                "sector_coefficient" => 6,
                "selected_tickers" => array(
                    "JPM" => 0.18,
                    "GS" => 0.16,
                    "V" => 0.14,
                    "BAC" => 0.12,
                    "AXP" => 0.1,
                    "WFC" => 0.1,
                    "USB" => 0.08,
                    "PNC" => 0.06,
                    "AMG" => 0.04,
                    "AIG" => 0.02,
                ),
            ),
            array(
                "sector" => "Energy",
                "directory" => "energy",
                "sector_weight" => 0.1,
                "sector_coefficient" => 6,
                "selected_tickers" => array(
                    "CVX" => 0.18,
                    "XOM" => 0.16,
                    "APA" => 0.14,
                    "COP" => 0.12,
                    "BHGE" => 0.1,
                    "VLO" => 0.1,
                    "APC" => 0.08,
                    "ANDV" => 0.06,
                    "OXY" => 0.04,
                    "HAL" => 0.02,
                ),
            ),
            array(
                "sector" => "Industrials",
                "directory" => "industrials",
                "sector_weight" => 0.08,
                "sector_coefficient" => 8,
                "selected_tickers" => array(
                    "CAT" => 0.18,
                    "FLR" => 0.16,
                    "GE" => 0.14,
                    "JEC" => 0.12,
                    "JCI" => 0.1,
                    "MAS" => 0.1,
                    "FLS" => 0.08,
                    "AAL" => 0.06,
                    "AME" => 0.04,
                    "CHRW" => 0.02,
                ),
            ),
            array(
                "sector" => "Materials and Chemicals",
                "directory" => "materials-and-chemicals",
                "sector_weight" => 0.08,
                "sector_coefficient" => 8,
                "selected_tickers" => array(
                    "DWDP" => 0.18,
                    "APD" => 0.16,
                    "EMN" => 0.14,
                    "ECL" => 0.12,
                    "FMC" => 0.1,
                    "LYB" => 0.1,
                    "MOS" => 0.08,
                    "NEM" => 0.06,
                    "PPG" => 0.04,
                    "MLM" => 0.02,
                ),
            ),
            array(
                "sector" => "Utilities",
                "directory" => "utilities",
                "sector_weight" => 0.08,
                "sector_coefficient" => 8,
                "selected_tickers" => array(
                    "PPL" => 0.18,
                    "PCG" => 0.16,
                    "SO" => 0.14,
                    "WEC" => 0.12,
                    "PEG" => 0.1,
                    "XEL" => 0.1,
                    "D" => 0.08,
                    "NGG" => 0.06,
                    "NEE" => 0.04,
                    "PNW" => 0.02,
                ),
            ),
            array(
                "sector" => "Consumer Discretionary",
                "directory" => "consumer-discretionary",
                "sector_weight" => 0.08,
                "sector_coefficient" => 8,
                "selected_tickers" => array(
                    "DIS" => 0.18,
                    "HD" => 0.16,
                    "BBY" => 0.14,
                    "CBS" => 0.12,
                    "CMG" => 0.1,
                    "MCD" => 0.1,
                    "GPS" => 0.08,
                    "HOG" => 0.06,
                    "AZO" => 0.04,
                    "EXPE" => 0.02,
                ),
            ),
            array(
                "sector" => "Consumer Staples",
                "directory" => "consumer-staples",
                "sector_weight" => 0.06,
                "sector_coefficient" => 8,
                "selected_tickers" => array(
                    "PEP" => 0.18,
                    "PM" => 0.16,
                    "PG" => 0.14,
                    "MNST" => 0.12,
                    "TSN" => 0.1,
                    "CPB" => 0.1,
                    "HRL" => 0.08,
                    "SJM" => 0.06,
                    "CAG" => 0.04,
                    "KHC" => 0.02,
                ),
            ),
            array(
                "sector" => "Defense",
                "directory" => "defense-and-aerospace",
                "sector_weight" => 0.04,
                "sector_coefficient" => 10,
                "selected_tickers" => array(
                    "BA" => 0.18,
                    "LMT" => 0.16,
                    "UTX" => 0.14,
                    "NOC" => 0.12,
                    "HON" => 0.1,
                    "RTN" => 0.1,
                    "TXT" => 0.08,
                    "LLL" => 0.06,
                    "COL" => 0.04,
                    "GD" => 0.02,
                ),
            ),
            array(
                "sector" => "Health",
                "directory" => "health-care-and-pharmaceuticals",
                "sector_weight" => 0.04,
                "sector_coefficient" => 10,
                "selected_tickers" => array(
                    "UNH" => 0.18,
                    "JNJ" => 0.16,
                    "PFE" => 0.14,
                    "UHS" => 0.12,
                    "AET" => 0.1,
                    "RMD" => 0.1,
                    "TMO" => 0.08,
                    "MRK" => 0.06,
                    "ABT" => 0.04,
                    "LLY" => 0.02,
                ),
            ),

            array(
                "sector" => "Real Estate",
                "directory" => "real-estate",
                "sector_weight" => 0.04,
                "sector_coefficient" => 10,
                "selected_tickers" => array(
                    "CCI" => 0.18,
                    "AMT" => 0.16,
                    "AVB" => 0.14,
                    "HCP" => 0.12,
                    "RCL" => 0.1,
                    "HST" => 0.1,
                    "NCLH" => 0.08,
                    "HLT" => 0.06,
                    "ARE" => 0.04,
                    "AIV" => 0.02,
                ),
            ),
        );
        return $sector_movers;
    }

}
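
The two arithmetic steps in getSectors(), the weighted sector value and the power-of-ten index factor, can be isolated as pure functions so they are easier to test and profile. This is a sketch under assumptions: sectorIndexValue() and indexFactor() are hypothetical names, the quote fields are passed in already unwrapped from the "quote" key, and the sample numbers are made up.

```php
<?php
declare(strict_types=1);

/**
 * Weighted sum of per-ticker change amounts for one sector.
 * $weights maps ticker => weight; $quotes maps ticker => quote fields.
 * Both shapes are assumptions made for this sketch.
 */
function sectorIndexValue(float $sectorWeight, array $weights, array $quotes): float
{
    $value = 0.0;
    foreach ($quotes as $ticker => $q) {
        if (!isset($weights[$ticker], $q['extendedChangePercent'], $q['changePercent'], $q['ytdChange'])) {
            continue; // skip tickers with a missing weight or field
        }
        $change = ($q['extendedChangePercent'] + $q['changePercent'] + $q['ytdChange']) / 200;
        $value += $sectorWeight * $weights[$ticker] * $change;
    }
    return $value;
}

/**
 * Mirrors the loop in getSectors(): find the first 10^i (i = 0..10) with
 * 10^i * |$value| > 1, then step back one power of ten.
 */
function indexFactor(float $value): float
{
    $magnitude = abs($value);
    for ($i = 0; $i <= 10; $i++) {
        if ((10.0 ** $i) * $magnitude > 1) {
            return 10.0 ** ($i - 1);
        }
    }
    return 10.0 ** 10; // no power matched (for example $value is 0), as in the original loop
}

// Made-up sample numbers, for illustration only.
$value = sectorIndexValue(0.18, ['AAPL' => 0.18], [
    'AAPL' => ['extendedChangePercent' => 0.01, 'changePercent' => 0.02, 'ytdChange' => 0.07],
    'XYZ'  => ['changePercent' => 0.5], // ignored: no weight and missing fields
]);
echo indexFactor(0.0034), PHP_EOL; // prints 100
```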

Acknowledgement

I’d like to thank these users for being so helpful; I was able to implement some of their advice in the code.



#StackBounty: #performance #beginner #php #file-system #api API equity scraping: Fast and/or efficient file writing using PHP

Bounty: 50

This is my first scripting project, and the only goal is to make the scripts as fast and efficient as possible on a single machine. Would you kindly review my working code and help me reach that goal?

There is only one main class (EQ) that does this job: it scrapes equities data using the EquityRecords class and calculates sector coefficients using the
SectorMovers class.

I tried to add the EQ class to the post, but couldn’t because of the post character limit. You can view it at this GitHub link.

Acknowledgment

I’d also like to thank the following users for their kind help:

  • Andreas

  • ArtisticPhoenix

  • Dharman

  • kemicofa

  • Mick Harner

  • Nina Scholz

  • Oh My Goodness

  • Sᴀᴍ Onᴇᴌᴀ

