Top Python Interview Questions and Answers

Q1 - How would you do feature scaling in Python?

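import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# mydata is assumed to be an existing DataFrame with a numeric column 'my_column'
scaler = MinMaxScaler()
original_data = mydata[['my_column']]
# fit_transform rescales each column to the [0, 1] range
scaled_data = pd.DataFrame(scaler.fit_transform(original_data), columns=original_data.columns)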

Q2 - You have been given the list of positive integers from 1 to n. All the numbers from 1 to n are present except x, and you have to find x. Write code for that.

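def find_missing_num(nums):
    # The list holds n - 1 of the numbers 1..n, so n is one more than its length
    n = len(nums) + 1
    # Sum of 1..n; integer division keeps the result an int
    expected_sum = n * (n + 1) // 2
    return expected_sum - sum(nums)

mylist = [1, 5, 6, 3, 4]
print(find_missing_num(mylist))  # Outputs: 2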

Q3 - What is the pickle module in Python?

The pickle module serializes and deserializes Python objects. Serializing ("pickling") converts an object hierarchy into a byte stream that can be saved to disk or sent over a network; deserializing ("unpickling") rebuilds the object from that byte stream.
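
A minimal sketch of a pickle round trip. The record dictionary is made up for illustration. Note that unpickling can execute arbitrary code, so only load data from a source you trust.

```python
import pickle

# Hypothetical object used only for illustration
record = {"name": "Alice", "scores": [90, 85]}

# Serialize the object to a byte string
payload = pickle.dumps(record)

# Deserialize the bytes back into an equal, but distinct, object
restored = pickle.loads(payload)

print(type(payload).__name__)  # Outputs: bytes
print(restored == record)      # Outputs: True
print(restored is record)      # Outputs: False
```

To write to a file instead, pickle.dump and pickle.load take a file object opened in binary mode.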

Q4 - How do you get the indices of the N maximum values in a NumPy array?

import numpy as np

arr = np.array([10, 30, 20, 40, 50])
N = 2
# argsort sorts ascending, so take the last N indices and reverse them
print(arr.argsort()[-N:][::-1])  # Outputs: [4 3]

Q5 - What is the difference between .pyc and .py file formats in Python?

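.py files contain Python source code; .pyc files contain the bytecode compiled from it. When a module is imported, Python caches its bytecode as a .pyc file (in the __pycache__ directory in Python 3) and, on later imports, loads the cached file instead of recompiling, provided the source has not changed. This shortens load time but does not make the code run faster once loaded.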

Q6 - What are global variables and local variables in Python?

A local variable is any variable assigned within a function. It exists only in that function's local scope, not in the global scope. Global variables are assigned outside of any function, at module level. Any function in the module can read them, but a function needs the global keyword to rebind one.
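
A short sketch of the difference, using a hypothetical counter variable. It shows that reading a global needs no declaration, while rebinding it needs the global keyword, and that a plain assignment inside a function creates a separate local.

```python
counter = 0  # global variable, defined at module level

def read_counter():
    # Reading a global inside a function needs no declaration
    return counter

def increment():
    global counter  # required to rebind the global name
    counter += 1

def shadow():
    counter = 100  # creates a new local variable; the global is untouched
    return counter

increment()
print(read_counter())  # Outputs: 1
print(shadow())        # Outputs: 100
print(counter)         # Outputs: 1
```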

Q7 - What are lambda functions?

Lambda functions are anonymous functions in Python. They're helpful when you need a very short function that consists of a single expression. Instead of formally defining a named function with a body and a return statement, you can write everything in one line.

Here's an example of how a lambda function is defined and called:

print((lambda x, y: x + y)(3, 2))  # Outputs: 5

Q8 - What is a negative index, and how is it used in Python?

A negative index counts from the end of a list, string, or other sequence. Thus, [-1] refers to the last element, [-2] to the second-to-last element, and so on.
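
As a brief illustration, with a made-up list of letters:

```python
letters = ["a", "b", "c", "d", "e"]

print(letters[-1])   # Outputs: e
print(letters[-2])   # Outputs: d
print(letters[-3:])  # Outputs: ['c', 'd', 'e']
print("python"[-1])  # Outputs: n

# letters[-i] is equivalent to letters[len(letters) - i]
print(letters[-2] == letters[len(letters) - 2])  # Outputs: True
```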

Q9 - Do you know about vectorization in pandas?

Vectorization means applying an operation to a whole Series or DataFrame at once instead of looping over rows in Python. The work is delegated to optimized, compiled routines. For example, to sum a column, call the column's sum() method rather than iterating over each row.
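
The sketch below contrasts the two approaches on a small, made-up DataFrame. Both give the same total; the vectorized form is the one that scales to large frames.

```python
import pandas as pd

# Hypothetical data used only for illustration
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "quantity": [1, 2, 3]})

# Loop-based: iterate over rows one at a time (slow on large frames)
loop_total = 0.0
for _, row in df.iterrows():
    loop_total += row["price"] * row["quantity"]

# Vectorized: multiply whole columns at once, then aggregate
vectorized_total = (df["price"] * df["quantity"]).sum()

print(loop_total)        # Outputs: 140.0
print(vectorized_total)  # Outputs: 140.0
```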

Q10 - What is the use of PYTHONPATH?

PYTHONPATH is an environment variable that lists extra directories the Python interpreter searches when importing modules. It uses the same format as the shell's PATH variable, and its entries are inserted into sys.path ahead of the standard library and site-packages directories.

Q10 - What’s the difference between / and // in Python?

Both / and // are division operators. / performs true division and always returns a float, so 8 / 2 is 4.0. // performs floor division: it divides and rounds the result down to the nearest whole number, returning an int when both operands are ints.

  • An example: 9 / 2 returns 4.5
  • An example: 9 // 2 returns 4
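
A few more cases, including the ones that tend to come up as follow-up questions (negative operands and float operands):

```python
print(9 / 2)     # Outputs: 4.5
print(9 // 2)    # Outputs: 4
print(8 / 2)     # Outputs: 4.0  (/ always returns a float)
print(-9 // 2)   # Outputs: -5   (rounds toward negative infinity, not toward zero)
print(9.0 // 2)  # Outputs: 4.0  (a float operand gives a float result)
```
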

Q11 - Compare pandas and Spark.

Pandas is a good choice for small to medium-sized datasets that fit in the memory of a single machine: it is easy to use and has little overhead at that scale. Spark is a better choice for large datasets because it distributes work across a cluster and so scales to data that a single machine cannot hold. Spark also integrates smoothly with Hadoop-based environments.

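Q12 - Given a DataFrame of test scores, write Python code that buckets each score into <50, <75, <90, or <100.

import pandas as pd

def test_scores_bucket(df):
    bins = [0, 50, 75, 90, 100]
    labels = ['<50', '<75', '<90', '<100']
    # right=False makes each bin include its lower edge and exclude its upper edge,
    # matching the strict "<" labels; a score of exactly 100 falls outside the bins (NaN)
    df['test score'] = pd.cut(df['test score'], bins, labels=labels, right=False)
    return df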

Q13 - How can you obtain the principal components and the eigenvalues from Scikit-Learn PCA?

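import numpy as np
from sklearn.decomposition import PCA

data = np.array([[2.5, 2.4], [0.5, 0.7], [1.1, 0.9]])

pca = PCA()
pca.fit(data)

# Principal components (eigenvectors of the covariance matrix), one per row
print(pca.components_)

# Variance explained by each component (the corresponding eigenvalues)
print(pca.explained_variance_)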

Q14 - What is a Python dictionary and how do you use it?

A Python dictionary is a mutable collection of key-value pairs with unique, hashable keys. Since Python 3.7, dictionaries preserve insertion order. You can create a dictionary using curly braces {} or the dict() function. For example:

my_dict = {'name': 'Alice', 'age': 25}
print(my_dict['name']) # Outputs: Alice

Q15 - How do you handle missing values in a pandas DataFrame?

Missing values can be handled in several ways in pandas:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, None], 'B': [4, None, 6]})
# Drop rows with any NaN values
df.dropna()
# Fill NaN values with a specific value
df.fillna(0)
# Forward-fill: replace each NaN with the last valid value above it
# (fillna(method='ffill') is deprecated in recent pandas versions)
df.ffill()
# Each call returns a new DataFrame; df itself is left unchanged

Q16 - Explain the difference between loc and iloc in pandas.

loc is label-based: you select rows and columns by their index labels and column names, and slices include both endpoints. iloc is position-based: you select rows and columns by integer position, and slices exclude the end position, as in ordinary Python slicing.

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
print(df.loc[0:1, ['A', 'B']])  # Label-based: rows labeled 0 and 1 (end label included)
print(df.iloc[0:1, 0:2])  # Position-based: row 0 only (end position excluded)

Q17 - What are Python list comprehensions? Provide an example.

List comprehensions provide a concise way to create lists. A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses.

squares = [x**2 for x in range(10)]
print(squares) # Outputs: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
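
The optional clauses mentioned above can be sketched as follows: an if clause filters items, and a second for clause behaves like a nested loop. The variable names are illustrative.

```python
# Filter with an if clause: squares of even numbers only
even_squares = [x**2 for x in range(10) if x % 2 == 0]
print(even_squares)  # Outputs: [0, 4, 16, 36, 64]

# Two for clauses: equivalent to nested loops, leftmost loop outermost
pairs = [(x, y) for x in range(2) for y in "ab"]
print(pairs)  # Outputs: [(0, 'a'), (0, 'b'), (1, 'a'), (1, 'b')]
```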

Q18 - How do you merge two DataFrames in pandas?

DataFrames can be merged using the merge() function in pandas. You specify the key columns to join on and the join type (inner, left, right, or outer).

import pandas as pd

df1 = pd.DataFrame({'key': ['A', 'B', 'C'], 'value': [1, 2, 3]})
df2 = pd.DataFrame({'key': ['A', 'B', 'D'], 'value': [4, 5, 6]})
# Inner join keeps only keys present in both frames (A and B);
# the overlapping 'value' columns become value_x and value_y
merged_df = pd.merge(df1, df2, on='key', how='inner')
print(merged_df)

Q19 - Explain the difference between map(), apply(), and applymap() in pandas.

  • map(): A Series method that transforms each value using a function, dictionary, or Series.
  • apply(): Applies a function along an axis of a DataFrame (once per column or once per row), or to each value of a Series.
  • applymap(): Applies a function element-wise across a whole DataFrame. It is deprecated since pandas 2.1 in favor of DataFrame.map().
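
A small sketch of the three methods on a made-up DataFrame. Because applymap was renamed to DataFrame.map in pandas 2.1, the example picks whichever name the installed version provides; that fallback is a convenience for this sketch, not a recommended pattern.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Series.map: element-wise substitution on one column
mapped = df["A"].map({1: "one", 2: "two", 3: "three"})

# DataFrame.apply: the function receives a whole column (default, axis=0) ...
column_ranges = df.apply(lambda col: col.max() - col.min())
# ... or a whole row (axis=1)
row_sums = df.apply(lambda row: row["A"] + row["B"], axis=1)

# Element-wise over the whole DataFrame: DataFrame.map in pandas >= 2.1,
# applymap in older versions
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap
doubled = elementwise(lambda v: v * 2)

print(mapped.tolist())         # Outputs: ['one', 'two', 'three']
print(column_ranges.tolist())  # Outputs: [2, 2]
print(row_sums.tolist())       # Outputs: [5, 7, 9]
print(doubled["B"].tolist())   # Outputs: [8, 10, 12]
```
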

Q20 - How do you remove duplicates from a pandas DataFrame?

df = pd.DataFrame({'A': [1, 2, 2, 3], 'B': [4, 5, 5, 6]})
# Keeps the first occurrence of each duplicated row
df = df.drop_duplicates()
print(df)

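Q20 - Explain the concept of broadcasting in NumPy.

Broadcasting describes how NumPy performs element-wise arithmetic on arrays of different shapes. Shapes are compared dimension by dimension from the right; two dimensions are compatible when they are equal or one of them is 1. An array with size 1 along a dimension is virtually stretched along it to match the other array, without copying data.

import numpy as np

a = np.array([1, 2, 3])        # shape (3,)
b = np.array([[4], [5], [6]])  # shape (3, 1)
result = a + b                 # broadcast to shape (3, 3)
print(result)  # Outputs: [[5 6 7], [6 7 8], [7 8 9]]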

Q21 - What is a decorator in Python, and how do you use it?

A decorator is a function that takes another function and extends its behavior without explicitly modifying it. Decorators are often used for logging, access control, memoization, and more.

def my_decorator(func):
    def wrapper():
        print("Something is happening before the function is called.")
        func()
        print("Something is happening after the function is called.")
    return wrapper

# Equivalent to: say_hello = my_decorator(say_hello)
@my_decorator
def say_hello():
    print("Hello!")

say_hello()

Q22 - Explain the concept of metaclasses in Python.

Metaclasses are classes of classes, meaning they define how classes behave. A class is an instance of a metaclass. By default, Python uses type as the metaclass, but you can define custom metaclasses to control class creation.

class MyMeta(type):
    def __new__(cls, name, bases, dct):
        # Runs once, when the class statement that uses this metaclass is executed
        print(f'Creating class {name}')
        return super().__new__(cls, name, bases, dct)

class MyClass(metaclass=MyMeta):
    pass

Q23 - How does the @staticmethod decorator differ from @classmethod?

@staticmethod defines a method that receives neither the instance nor the class, so it is effectively a plain function that lives in the class namespace. @classmethod, on the other hand, receives the class as its first parameter (cls) and can read or modify class state.

class MyClass:
    @staticmethod
    def static_method():
        print("Static method")

    @classmethod
    def class_method(cls):
        print(f"Class method of {cls}")

MyClass.static_method()
MyClass.class_method()

Q24 - What is a generator in Python, and how is it different from a normal function?

A generator is a function that uses yield instead of return. Calling it returns an iterator that produces items one at a time, pausing at each yield and resuming on the next request. Because items are computed lazily rather than stored all at once, generators save memory and suit large datasets or streams.

def my_generator():
    for i in range(5):
        yield i

gen = my_generator()
for value in gen:
    print(value)

Q25 - What are Python coroutines, and how do they differ from generators?

Coroutines are defined with async def and, like generators, can be suspended and resumed. A coroutine suspends at each await, handing control back to the event loop so other tasks can run while it waits, which makes coroutines suited to asynchronous I/O and cooperative multitasking. Generators produce values for iteration; coroutines are awaited for a result.

import asyncio

async def my_coroutine():
    print("Coroutine started")
    await asyncio.sleep(1)
    print("Coroutine ended")

asyncio.run(my_coroutine())

Q26 - What are context managers and the with statement used for in Python?

Context managers handle the setup and teardown of resources, ensuring that resources are properly cleaned up after use. The with statement simplifies exception handling by encapsulating common preparation and cleanup tasks.

# The file is closed automatically when the block exits, even if an exception is raised
with open('file.txt', 'w') as file:
    file.write('Hello, World!')
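
To show what the with statement does behind the scenes, here is a minimal sketch of a custom context manager. The Resource class is hypothetical; it only records when setup (__enter__) and teardown (__exit__) run, and demonstrates that teardown still happens when the body raises.

```python
class Resource:
    """Hypothetical resource; records setup and teardown events."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("acquired")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs even if the with body raises
        self.events.append("released")
        return False  # do not suppress the exception

resource = Resource()
try:
    with resource:
        raise ValueError("failure inside the with block")
except ValueError:
    pass

print(resource.events)  # Outputs: ['acquired', 'released']
```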

Q27 - Explain Python's memory management and garbage collection.

Python (specifically CPython) uses reference counting together with a cyclic garbage collector. Each object tracks the number of references to it and is freed as soon as that count drops to zero. Reference counting alone cannot free objects that refer to each other in a cycle, so the garbage collector periodically finds and collects such unreachable cycles.
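
The sketch below illustrates both mechanisms. It assumes CPython, since other implementations manage memory differently. The Node class is hypothetical, and automatic collection is paused only to keep the demonstration deterministic.

```python
import gc
import sys
import weakref

class Node:
    """Hypothetical class used only to build a reference cycle."""

    def __init__(self):
        self.partner = None

# Reference counting: binding a second name adds one reference
data = []
count_before = sys.getrefcount(data)
alias = data
count_after = sys.getrefcount(data)
print(count_after - count_before)  # Outputs: 1

# Cyclic garbage collection
gc.collect()   # start from a clean state
gc.disable()   # pause automatic collection for the demonstration

a, b = Node(), Node()
a.partner, b.partner = b, a  # reference cycle: a -> b -> a
probe = weakref.ref(a)       # a weak reference does not keep a alive

del a, b
alive_before = probe() is not None  # the cycle keeps both objects alive
gc.collect()
alive_after = probe() is not None   # the collector has freed the cycle

gc.enable()
print(alive_before, alive_after)  # Outputs: True False
```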

Q28 - How do you create and use a custom exception in Python?

Custom exceptions are created by subclassing the Exception class. A subclass can add its own attributes and methods, for example to carry extra context about the error.
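
A minimal sketch of defining, raising, and catching a custom exception. InsufficientFundsError and withdraw are hypothetical names chosen for illustration.

```python
class InsufficientFundsError(Exception):
    """Hypothetical exception raised when a withdrawal exceeds the balance."""

    def __init__(self, balance, amount):
        super().__init__(f"Cannot withdraw {amount}; balance is only {balance}")
        self.balance = balance
        self.amount = amount

    @property
    def shortfall(self):
        return self.amount - self.balance

def withdraw(balance, amount):
    if amount > balance:
        raise InsufficientFundsError(balance, amount)
    return balance - amount

try:
    withdraw(100, 150)
except InsufficientFundsError as exc:
    print(exc)            # Outputs: Cannot withdraw 150; balance is only 100
    print(exc.shortfall)  # Outputs: 50
```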

Q29 - What is the purpose of the __new__ method in Python?

__new__ is a static method responsible for creating a new instance of a class. It is called before __init__, which then initializes the instance that __new__ returns. It is typically overridden in singleton patterns or when subclassing immutable types.

class Singleton:
    _instance = None

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            # object.__new__ accepts only the class, so the extra arguments are not forwarded
            cls._instance = super().__new__(cls)
        return cls._instance

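Q30 - Explain the Global Interpreter Lock (GIL) in Python. How does it affect multi-threading?

In CPython, the GIL is a mutex that protects access to Python objects by allowing only one thread to execute Python bytecode at a time. Threads in a multi-threaded program therefore take turns rather than running Python code in parallel. This simplifies memory management, in particular reference counting, but makes threads a bottleneck for CPU-bound work. I/O-bound programs are less affected, because a thread releases the GIL while it waits on I/O; CPU-bound work is usually parallelized with multiple processes instead, for example through the multiprocessing module.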