#StackBounty: #python #pip #python-module Figuring the required Python modules and their versions of a Python process

Bounty: 50

I have a tool which follows the system calls of a process. That way I know all the files/areas that were used by a process. I have a Python script which is being executed (creating a process). I know all the files that were used during the run, such as the script itself. I also know the files of the modules that were used. The modules are installed in /tmp/vendor.

Based on the files inside /tmp/vendor that I found, I’m trying to figure out the module name and module version so I can create a requirements file for pip and then install the modules using pip install (to some other directory). Basically, I want to be able to know all the module dependencies of a Python process. Those modules could come from different areas, but let’s focus on one (/tmp/vendor). The way I installed the modules into /tmp/vendor is just:

pip install --requirement requirements.txt --target /tmp/vendor

Now I want to be able to build this requirements.txt file, based on the files in /tmp/vendor.

The solution could be dynamic or static. At first I tried to solve it in a static way – check the files in /tmp/vendor. I did an example – I installed requests:

pip install requests --target /tmp/vendor

As I understand, it installs the latest version. Inside vendor I have:

ls -la vendor/
total 52
drwxr-x--- 13 user group 4096 Sep 26 17:37 .
drwxr-x---  8 user group 4096 Sep 26 17:37 ..
drwxr-x---  2 user group 4096 Sep 26 17:37 bin
drwxr-x---  3 user group 4096 Sep 26 17:37 certifi
drwxr-x---  2 user group 4096 Sep 26 17:37 certifi-2021.5.30.dist-info
drwxr-x---  5 user group 4096 Sep 26 17:37 charset_normalizer
drwxr-x---  2 user group 4096 Sep 26 17:37 charset_normalizer-2.0.6.dist-info
drwxr-x---  3 user group 4096 Sep 26 17:37 idna
drwxr-x---  2 user group 4096 Sep 26 17:37 idna-3.2.dist-info
drwxr-x---  3 user group 4096 Sep 26 17:37 requests
drwxr-x---  2 user group 4096 Sep 26 17:37 requests-2.26.0.dist-info
drwxr-x---  6 user group 4096 Sep 26 17:37 urllib3
drwxr-x---  2 user group 4096 Sep 26 17:37 urllib3-1.26.7.dist-info

Now I can see that it also installs other modules that are needed, such as urllib3 and idna.
So my tool finds, for example, that I was using:

/tmp/vendor/requests/utils.py

I also noticed that each module’s metadata directory follows the format:

$NAME-(.*).dist-info

The captured group is the module’s version. So at first I thought I could parse /tmp/vendor/(.*)/.* to get the module name ($NAME) and then look for $NAME-(.*).dist-info, but I noticed that some modules don’t have this *.dist-info directory, so I could not figure out their versions, which made me abandon this approach.

I also tried some dynamic approaches: I know which Python version was used, so I could run python and try to load the module. But I could not figure out a way to find the version of the module.

To summarize: I’m looking for a robust way to figure out the modules that are required for my Python process to run. The modules should come with their versions. All of the modules were installed using pip, which should simplify the task. How can it be done?
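For the static approach, when a *.dist-info directory is present, its metadata already records the canonical name and version, so the standard library can read it directly. A minimal sketch (assuming Python 3.8+, and that every package in the target directory was pip-installed with intact metadata):

```python
from importlib import metadata


def requirements_from_target(target="/tmp/vendor"):
    """Build requirements.txt lines from the dist-info metadata in a
    pip --target directory."""
    reqs = set()
    # distributions(path=[...]) restricts discovery to the given directory
    for dist in metadata.distributions(path=[target]):
        reqs.add(f"{dist.metadata['Name']}=={dist.version}")
    return sorted(reqs)
```

Packages without a *.dist-info directory (e.g. copied in by hand) still won’t show up, which matches the limitation described above.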


Get this bounty!!!

#StackBounty: #python #pytorch #layer How to find input layers names for intermediate layer in PyTorch model?

Bounty: 50

I have a somewhat complicated model in PyTorch. How can I print the names (or IDs) of the layers connected to a given layer’s input? To start, I want to find them for the Concat layer. See the example code below:

import torch
import torch.nn as nn
import torch.nn.functional as F


class Concat(nn.Module):
    def __init__(self, dimension=1):
        super().__init__()
        self.d = dimension

    def forward(self, x):
        return torch.cat(x, self.d)


class SomeModel(nn.Module):
    def __init__(self):
        super(SomeModel, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.conv2 = nn.Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        self.conc = Concat(1)
        self.linear = nn.Linear(8192, 1)

    def forward(self, x):
        out1 = F.relu(self.bn1(self.conv1(x)))
        out2 = F.relu(self.conv2(x))
        out = self.conc([out1, out2])
        out = F.avg_pool2d(out, 4)
        out = out.view(out.size(0), -1)
        out = self.linear(out)
        return out


if __name__ == '__main__':
    model = SomeModel()
    print(model)
    y = model(torch.randn(1, 3, 32, 32))
    print(y.size())
    for name, m in model.named_modules():
        if 'Concat' in m.__class__.__name__:
            print(name, m, m.__class__.__name__)
            # Here print names of all input layers for Concat
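One direction (a sketch, not the only way): torch.fx can trace the model into a graph whose nodes record their producers, so the inputs of each concatenation can be read off node.args. Note that symbolic_trace traces *through* the custom Concat module, so the concatenation shows up as a call_function node for torch.cat rather than as the 'conc' submodule:

```python
import torch
from torch.fx import symbolic_trace


def cat_input_names(model):
    """Map each traced torch.cat node to the names of the nodes feeding it."""
    traced = symbolic_trace(model)
    producers = {}
    for node in traced.graph.nodes:
        if node.op == "call_function" and node.target is torch.cat:
            # node.args[0] is the list of tensors being concatenated
            producers[node.name] = [str(arg) for arg in node.args[0]]
    return producers
```

For the SomeModel above this yields something like {'cat': ['relu', 'relu_1']}, i.e. the two F.relu outputs feed the concatenation.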


Get this bounty!!!

#StackBounty: #python #image-processing Get mask of image without using OpenCV

Bounty: 150

I’m trying the following to get the mask out of this image, but unfortunately it fails.

import numpy as np
import skimage.color
import skimage.filters
import skimage.io

# get filename, sigma, and threshold value from command line
filename = 'pathToImage'

# read and display the original image
image = skimage.io.imread(fname=filename)
skimage.io.imshow(image)
# blur and grayscale before thresholding
blur = skimage.color.rgb2gray(image)
blur = skimage.filters.gaussian(blur, sigma=2)
# perform inverse binary thresholding
mask = blur < 0.8
# use the mask to select the "interesting" part of the image
sel = np.ones_like(image)
sel[mask] = image[mask]

# display the result
skimage.io.imshow(sel)

How can I obtain the mask?

[image] [image]

Is there a general approach that would work for this image as well, without custom fine-tuning and changing parameters?
[image]
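A sketch of a less hand-tuned variant, staying with scikit-image: let Otsu’s method pick the threshold from the image itself instead of hard-coding 0.8. This still assumes the object is darker than the background; invert the comparison otherwise:

```python
import numpy as np
import skimage.color
import skimage.filters


def auto_mask(image, sigma=2):
    """Boolean mask of the darker-than-background region; threshold chosen by Otsu."""
    gray = skimage.color.rgb2gray(image)
    blur = skimage.filters.gaussian(gray, sigma=sigma)
    t = skimage.filters.threshold_otsu(blur)  # data-driven instead of 0.8
    return blur < t
```

The mask can then be applied exactly as in the snippet above (sel[mask] = image[mask]).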


Get this bounty!!!

#StackBounty: #python #pandas divide grouped data based on selected column values?

Bounty: 50

df

      ts_code    type     close
0     861001.TI     1   648.399
1     861001.TI    20   588.574
2     861001.TI    30   621.926
3     861001.TI    60   760.623
4     861001.TI    90   682.313
...         ...   ...       ...
8328  885933.TI     5  1083.141
8329  885934.TI     1   951.493
8330  885934.TI     5  1011.346
8331  885935.TI     1  1086.558
8332  885935.TI     5  1028.449

Goal

ts_code    l5d_close l20d_close …… l90d_close
861001.TI   NaN       1.10          0.95
……           ……       ……            ……

I want to group by ts_code and calculate the close of type 1 divided by the close of type N (N: 5, 20, 30, ……). Take 861001.TI for example: l5d_close is NaN because there is no row with type 5. l20d_close equals 648.399/588.574 = 1.10, and l90d_close equals 648.399/682.313 = 0.95. The result is rounded to two decimals.

Try

df.groupby('ts_code')
  .pipe(lambda x: x[x.type==1].close/x[x.type==10].close)

Got: KeyError: 'Column not found: False'

The type values is: 1,5,20,30,60,90,180,200

Notice: each type value appears at most once per ts_code.
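A sketch of one way to get there (assuming, as noted, at most one row per (ts_code, type) pair): pivot to wide form, then divide the type-1 column by each of the other type columns:

```python
import pandas as pd

# small slice of the data shown above
df = pd.DataFrame({
    "ts_code": ["861001.TI"] * 5 + ["885934.TI"] * 2,
    "type":    [1, 20, 30, 60, 90, 1, 5],
    "close":   [648.399, 588.574, 621.926, 760.623, 682.313, 951.493, 1011.346],
})

wide = df.pivot(index="ts_code", columns="type", values="close")
others = [c for c in wide.columns if c != 1]
# type-1 close divided by each other type's close, per ts_code
out = wide[others].rdiv(wide[1], axis=0).round(2)
out.columns = [f"l{c}d_close" for c in out.columns]
```

For 861001.TI this gives l20d_close == 1.10 and l90d_close == 0.95, matching the goal table, and l5d_close is NaN because that code has no type-5 row.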


Get this bounty!!!

#StackBounty: #python #types #tuples #type-hinting #python-typing Tuple with multiple numbers of arbitrary but equal type

Bounty: 50

Currently, I am checking for tuples with multiple (e.g. three) numbers of arbitrary but equal type in the following form:

from typing import Tuple, Union

Union[Tuple[int, int, int], Tuple[float, float, float]]

I want to make this check more generic, also allowing numpy number types, so I tried to use numbers.Number:

from numbers import Number
from typing import Tuple

Tuple[Number, Number, Number]

The above snippet also allows tuples of mixed types as long as everything is a number.

I’d like to restrict the tuple to numbers of equal type.

How can this be achieved?
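One partial direction, hedged as a sketch: a TypeVar bound to Number expresses the "same type in all three slots" intent at an annotation level. Static checkers may still widen mixed tuples to a common supertype rather than reject them, and run-time enforcement depends on the checker in use (e.g. typeguard):

```python
from numbers import Number
from typing import Tuple, TypeVar

N = TypeVar("N", bound=Number)


def first_of_triple(t: Tuple[N, N, N]) -> N:
    """All three elements are declared to share one number type N."""
    return t[0]


first_of_triple((1, 2, 3))        # N bound to int
first_of_triple((1.0, 2.0, 3.0))  # N bound to float
```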


Technically, this question applies to Python and the type hints specification itself. However, as pointed out in the comments, its handling is implementation specific, i.e. MyPy will not catch every edge case and/or inconsistency. Personally, I am using run-time checks with typeguard for testing and deactivate them entirely in production.


Get this bounty!!!

#StackBounty: #python #docker #scrapy #scrapy-splash #windows-server-2019 Connection was refused by other side: 10061: No connection co…

Bounty: 100

My steps:

  1. Build image docker build . -t scrapy
  2. Run a container docker run -it -p 8050:8050 --rm scrapy
  3. In container run scrapy project: scrapy crawl foobar -o allobjects.json

This works locally, but on my production server I get this error:

[scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.example.com via http://localhost:8050/execute> (failed 1 times): Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..

Note: I’m NOT using Docker Desktop, neither can I on this server.

Dockerfile

FROM mcr.microsoft.com/windows/servercore:ltsc2019

SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]

RUN setx /M PATH $('C:\Users\ContainerAdministrator\miniconda3\Library\bin;C:\Users\ContainerAdministrator\miniconda3\Scripts;C:\Users\ContainerAdministrator\miniconda3;' + $Env:PATH)
RUN Invoke-WebRequest "https://repo.anaconda.com/miniconda/Miniconda3-py38_4.10.3-Windows-x86_64.exe" -OutFile miniconda3.exe -UseBasicParsing; \
    Start-Process -FilePath 'miniconda3.exe' -Wait -ArgumentList '/S', '/D=C:\Users\ContainerAdministrator\miniconda3'; \
    Remove-Item .\miniconda3.exe; \
    conda install -y -c conda-forge scrapy;

RUN pip install scrapy-splash
RUN pip install scrapy-user-agents
    
#creates root directory if not exists, then enters it
WORKDIR /root/scrapy

COPY scrapy /root/scrapy

settings.py

SPLASH_URL = 'http://localhost:8050/'

OUTPUT with command scrapy crawl foobar -o allobjects.json

2021-09-15 20:12:16 [scrapy.core.engine] INFO: Spider opened
2021-09-15 20:12:16 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2021-09-15 20:12:16 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2021-09-15 20:12:16 [py.warnings] WARNING: C:\Users\ContainerAdministrator\miniconda3\lib\site-packages\scrapy_splash\request.py:41: ScrapyDeprecationWarning: Call to deprecated function to_native_str. Use to_unicode instead.
  url = to_native_str(url)

2021-09-15 20:12:16 [scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.117 Safari/537.36
2021-09-15 20:12:16 [scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.108 Safari/537.36
2021-09-15 20:12:17 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.example.com via http://localhost:8050/execute> (failed 1 times): Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..
2021-09-15 20:12:17 [scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36
2021-09-15 20:12:18 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.example.com via http://localhost:8050/execute> (failed 2 times): Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..
2021-09-15 20:12:18 [scrapy_user_agents.middlewares] DEBUG: Assigned User-Agent Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.146 Safari/537.36
2021-09-15 20:12:19 [scrapy.downloadermiddlewares.retry] ERROR: Gave up retrying <GET https://www.example.com via http://localhost:8050/execute> (failed 3 times): Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..
2021-09-15 20:12:19 [scrapy.core.scraper] ERROR: Error downloading <GET https://www.example.com via http://localhost:8050/execute>
Traceback (most recent call last):
  File "C:\Users\ContainerAdministrator\miniconda3\lib\site-packages\scrapy\core\downloader\middleware.py", line 45, in process_request
    return (yield download_func(request=request, spider=spider))
twisted.internet.error.ConnectionRefusedError: Connection was refused by other side: 10061: No connection could be made because the target machine actively refused it..
2021-09-15 20:12:19 [scrapy.core.engine] INFO: Closing spider (finished)
2021-09-15 20:12:19 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 3,
 'downloader/exception_type_count/twisted.internet.error.ConnectionRefusedError': 3,
 'downloader/request_bytes': 4632,
 'downloader/request_count': 3,
 'downloader/request_method_count/POST': 3,
 'elapsed_time_seconds': 3.310168,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2021, 9, 15, 18, 12, 19, 605641),
 'log_count/DEBUG': 6,
 'log_count/ERROR': 2,
 'log_count/INFO': 10,
 'log_count/WARNING': 46,
 'retry/count': 2,
 'retry/max_reached': 1,
 'retry/reason_count/twisted.internet.error.ConnectionRefusedError': 2,
 'scheduler/dequeued': 4,
 'scheduler/dequeued/memory': 4,
 'scheduler/enqueued': 4,
 'scheduler/enqueued/memory': 4,
 'splash/execute/request_count': 1,
 'start_time': datetime.datetime(2021, 9, 15, 18, 12, 16, 295473)}
2021-09-15 20:12:19 [scrapy.core.engine] INFO: Spider closed (finished)

What am I missing?

I already checked here:

UPDATE 1

I included EXPOSE 8050 in my Dockerfile, but I get the same error. I tried netstat -a inside the Docker container, but port 8050 does not seem to be listed:

C:\root\scrapy>netstat -a

Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:135            c60d48724046:0         LISTENING
  TCP    0.0.0.0:5985           c60d48724046:0         LISTENING
  TCP    0.0.0.0:47001          c60d48724046:0         LISTENING
  TCP    0.0.0.0:49152          c60d48724046:0         LISTENING
  TCP    0.0.0.0:49153          c60d48724046:0         LISTENING
  TCP    0.0.0.0:49154          c60d48724046:0         LISTENING
  TCP    0.0.0.0:49155          c60d48724046:0         LISTENING
  TCP    0.0.0.0:49159          c60d48724046:0         LISTENING
  TCP    [::]:135               c60d48724046:0         LISTENING
  TCP    [::]:5985              c60d48724046:0         LISTENING
  TCP    [::]:47001             c60d48724046:0         LISTENING
  TCP    [::]:49152             c60d48724046:0         LISTENING
  TCP    [::]:49153             c60d48724046:0         LISTENING
  TCP    [::]:49154             c60d48724046:0         LISTENING
  TCP    [::]:49155             c60d48724046:0         LISTENING
  TCP    [::]:49159             c60d48724046:0         LISTENING
  UDP    0.0.0.0:5353           *:*
  UDP    0.0.0.0:5355           *:*
  UDP    127.0.0.1:51352        *:*
  UDP    [::]:5353              *:*
  UDP    [::]:5355              *:*
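For context on the netstat output above: nothing in the Dockerfile starts a Splash service, so nothing listens on 8050 inside the container. For reference, the usual deployment (hedged; the official image is Linux-based, so this may not apply directly to a Windows-container-only host) runs Splash as its own container and points SPLASH_URL at that host instead of localhost inside the scrapy container:

```
# separate Splash container (official Linux image)
docker run -d -p 8050:8050 scrapinghub/splash

# settings.py then targets the Splash host, not localhost inside the scrapy container
SPLASH_URL = 'http://<splash-host>:8050/'
```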


Get this bounty!!!

#StackBounty: #python #d3.js #data-visualization Plotting categorical XY data including labels using Python (e.g. BCG matrices)

Bounty: 50

I like to draw 2×2 / BCG matrices. This time I have a rather big dataset (more than 50 topics and multiple values, e.g. A and B). I wonder how I can draw this using Python?

The result should look similar to this:

[image]

I have found a couple of questions regarding scatter plots, but none of them really deals well with, e.g., two topics having identical values (see topics 3, 2, L, J, … in the drawing above).

The ID should be displayed in the drawing, and IDs with the same set of values should not overlap but stay rather close together.

Is there a way to do this? If not in Python, I am also happy about other suggestions.

Here is an example dataset:

ID  Name        value_A     value_B
A   topic_1     2           4
B   topic_2     4           2
C   topic_3     3           3
D   topic_4     3           5
E   topic_5     3           4
F   topic_6     5           1
G   topic_7     4           5
H   topic_8     1           2
I   topic_9     4           1
J   topic_10    3           3
K   topic_11    5           5
L   topic_12    5           3
M   topic_13    3           5
N   topic_14    1           5
O   topic_15    4           1
P   topic_16    4           2
Q   topic_17    1           5
R   topic_18    2           3
S   topic_19    1           2
T   topic_20    5           1
U   topic_21    3           4
V   topic_22    2           5
W   topic_23    1           3
X   topic_24    3           3
Y   topic_25    4           1
Z   topic_26    2           4
1   topic_27    2           4
2   topic_28    5           4
3   topic_29    3           3
4   topic_30    4           4
5   topic_31    3           2
6   topic_32    4           2
7   topic_33    2           3
8   topic_34    2           3
9   topic_35    2           5
10  topic_36    4           2
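A sketch in matplotlib (assuming the table above is loaded into a pandas DataFrame `df` with columns ID, value_A, value_B; the jitter radius is an arbitrary choice): IDs sharing the same (value_A, value_B) cell are spread on a tiny circle around the point, so labels stay close together without overlapping:

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt


def plot_matrix(df, radius=0.15):
    fig, ax = plt.subplots(figsize=(6, 6))
    for (a, b), grp in df.groupby(["value_A", "value_B"]):
        n = len(grp)
        angles = np.linspace(0, 2 * np.pi, n, endpoint=False)
        for ang, label in zip(angles, grp["ID"]):
            r = radius if n > 1 else 0.0  # only jitter when IDs collide
            ax.text(a + r * np.cos(ang), b + r * np.sin(ang), str(label),
                    ha="center", va="center")
    ax.set_xlim(0.5, df["value_A"].max() + 0.5)
    ax.set_ylim(0.5, df["value_B"].max() + 0.5)
    # quadrant lines through the middle of the value range
    ax.axvline((df["value_A"].max() + df["value_A"].min()) / 2, color="grey")
    ax.axhline((df["value_B"].max() + df["value_B"].min()) / 2, color="grey")
    ax.set_xlabel("value_A")
    ax.set_ylabel("value_B")
    return ax


df = pd.DataFrame({"ID": ["A", "B", "C", "J"],
                   "value_A": [2, 4, 3, 3],
                   "value_B": [4, 2, 3, 3]})
ax = plot_matrix(df)
```

With the full dataset, C, J, 3, … (all at 3/3) end up on a small ring around the (3, 3) point instead of stacking on top of each other.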


Get this bounty!!!

#StackBounty: #python #django How to run a function with parameters periodically?

Bounty: 50

I have a form (DashboardData model) on my system that each user fills out. In this form, we ask for a username, a password, and an update interval for another system. The interval options are once a week, once a month, and once a year.

Depending on which interval the user selects, I have to run a function at that interval, with the username and password from the user’s form as parameters. For example, if the form contains a username and password and "once a week" is selected, then I have to run myFunction(username, password) once a week.

I tried to use apscheduler for this, but in apps.py I cannot reach request.user, so I cannot get the data. I need request.user to run my function.

forms.py

class SetupForm(forms.ModelForm):
    # Time (hours)
    PERIOD = (
        ('168', 'Once a week'),
        ('720', 'Once a month'),
        ('8766', 'Once a year'),
    )
    n_username = forms.CharField()
    n_password = forms.CharField(widget=forms.PasswordInput)
    period = forms.CharField(max_length=200, widget=forms.Select(choices=PERIOD))

    class Meta:
        model = DashboardData
        fields = ('n_username', 'n_password', 'period')

models.py

class DashboardData(models.Model):
    user = models.ForeignKey(UserProfile, on_delete=models.CASCADE, null=True) # request.user
    n_username = models.CharField(max_length=250)
    n_password = models.CharField(max_length=250)
    period = models.CharField(max_length=250)

functions.py

class myFunction():
    def __init__(self, n_user, n_password):
        self.time = datetime.now().strftime("%Y-%m-%d")
        self.location = a_url
        self.static_fields = {}
        self.index_name = "pre-" + self.time
        self.download_all(n_user, n_password)
        self.send_it()

apps.py

class DashboardConfig(AppConfig):
    default_auto_field = 'django.db.models.BigAutoField'
    name = 'dashboard'
    def ready(self):
        start()
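A sketch of one way around the request.user problem (hedged; the helper name `schedule_rows` is mine): the scheduler never actually needs request.user, because every (user, username, password, period) combination is already persisted in DashboardData. So start() can iterate the table and register one interval job per row. Here `scheduler` would be e.g. apscheduler's BackgroundScheduler and `rows` would be DashboardData.objects.all():

```python
def schedule_rows(scheduler, rows, func):
    """Register one interval job per saved form row.

    Each row is expected to expose n_username, n_password, period
    (in hours, as in the PERIOD choices) and pk, like DashboardData does.
    """
    for row in rows:
        scheduler.add_job(
            func,
            trigger="interval",
            hours=int(row.period),
            args=[row.n_username, row.n_password],
            id=f"dashboard-{row.pk}",  # stable id so re-runs replace, not duplicate
            replace_existing=True,
        )
```

ready() would then build the scheduler, call schedule_rows(scheduler, DashboardData.objects.all(), myFunction), and finally scheduler.start().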


Get this bounty!!!

#StackBounty: #python #api #article #scopus #httpx How to Use Elsevier Article Retrieval API to get fulltext of paper

Bounty: 50

I want to use Elsevier Article Retrieval API (https://dev.elsevier.com/documentation/FullTextRetrievalAPI.wadl) to get fulltext of paper.

I use httpx to get the information of the paper, but the response only contains some of it. My code is below:

import httpx


def scopus_paper_date(paper_doi, apikey):
    headers = {
        "X-ELS-APIKey": apikey,
        "Accept": 'text/xml'
    }

    timeout = httpx.Timeout(10.0, connect=60.0)
    client = httpx.Client(timeout=timeout, headers=headers)
    query = "&view=FULL"  # note: built but never appended to the URL
    url = "https://api.elsevier.com/content/article/doi/" + paper_doi
    r = client.get(url)
    print(r)
    return r.text

y = scopus_paper_date('10.1016/j.solmat.2021.111326',myapikey)
y

the result is below:

<full-text-retrieval-response xmlns="http://www.elsevier.com/xml/svapi/article/dtd" xmlns:bk="http://www.elsevier.com/xml/bk/dtd" xmlns:cals="http://www.elsevier.com/xml/common/cals/dtd" xmlns:ce="http://www.elsevier.com/xml/common/dtd" xmlns:ja="http://www.elsevier.com/xml/ja/dtd" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:sa="http://www.elsevier.com/xml/common/struct-aff/dtd" xmlns:sb="http://www.elsevier.com/xml/common/struct-bib/dtd" xmlns:tb="http://www.elsevier.com/xml/common/table/dtd" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xocs="http://www.elsevier.com/xml/xocs/dtd" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:prism="http://prismstandard.org/namespaces/basic/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><coredata><prism:url>https://api.elsevier.com/content/article/pii/S0927024821003688</prism:url>....

How can I get the full data of the paper? Many thanks!


Get this bounty!!!

#StackBounty: #python #pandas Pandas Group By With Overlapping Bins

Bounty: 50

I want to sum up data across overlapping bins. Basically the question here, but instead of the bins being (0-8 years old), (9-17 years old), (18-26 years old), (27-35 years old), and (36-44 years old), I want them to be (0-8 years old), (1-9 years old), (2-10 years old), (3-11 years old), and (4-12 years old).

Starting with a df like this

id awards age
1 100 24
1 150 26
1 50 54
2 193 34
2 209 50

I am using the code from this answer to calculate summation across non-overlapping bins.

bins = [9 * i for i in range(0, df['age'].max() // 9 + 2)]
cuts = pd.cut(df['age'], bins, right=False)

print(cuts)

0    [18, 27)
1    [18, 27)
2    [54, 63)
3    [27, 36)
4    [45, 54)
Name: age, dtype: category
Categories (7, interval[int64, left]): [[0, 9) < [9, 18) < [18, 27) < [27, 36) < [36, 45) < [45, 54) < [54, 63)]

df_out = (df.groupby(['id', cuts])
            .agg(total_awards=('awards', 'sum'))
            .reset_index(level=0)
            .reset_index(drop=True)
         )
df_out['age_interval'] = df_out.groupby('id').cumcount()

Result

print(df_out)

    id  total_awards  age_interval
0    1             0             0
1    1             0             1
2    1           250             2
3    1             0             3
4    1             0             4
5    1             0             5
6    1            50             6
7    2             0             0
8    2             0             1
9    2             0             2
10   2           193             3
11   2             0             4
12   2           209             5
13   2             0             6

Is it possible to work off the existing code to do this with overlapping bins?
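pd.cut only supports disjoint bins, so a sketch of a direct alternative (window width and step are parameters; this assumes right-open bins like the [18, 27) intervals above): slide a width-9 window one year at a time and aggregate each window separately:

```python
import pandas as pd

df = pd.DataFrame({"id":     [1, 1, 1, 2, 2],
                   "awards": [100, 150, 50, 193, 209],
                   "age":    [24, 26, 54, 34, 50]})


def overlapping_sums(df, width=9, step=1):
    rows = []
    for lo in range(0, df["age"].max() + 1, step):
        hi = lo + width
        window = df[(df["age"] >= lo) & (df["age"] < hi)]  # right-open [lo, hi)
        for id_, total in window.groupby("id")["awards"].sum().items():
            rows.append({"id": id_, "age_interval": f"[{lo}, {hi})",
                         "total_awards": total})
    return pd.DataFrame(rows)


out = overlapping_sums(df)
```

Unlike the pd.cut version, this only materializes windows that contain data; if the zero rows are needed, the result can be reindexed against the full set of (id, window) pairs afterwards.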


Get this bounty!!!