Module imodelsx.linear_ngram
Simple scikit-learn interface for fitting a linear model on top of n-gram features (count or TF-IDF vectors).
Classes
class LinearNgram (checkpoint: str = 'tfidfvectorizer',
tokenizer=None,
ngrams=2,
all_ngrams=True,
random_state=None)
class LinearNgram(BaseEstimator):
    def __init__(
        self,
        checkpoint: str = "tfidfvectorizer",
        tokenizer=None,
        ngrams=2,
        all_ngrams=True,
        random_state=None,
    ):
        """LinearNgram Class - use either LinearNgramClassifier or LinearNgramRegressor rather than initializing this class directly.

        Parameters
        ----------
        checkpoint: str
            Name of vectorizer checkpoint: "countvectorizer" or "tfidfvectorizer"
        ngrams
            Order of ngrams to extract. 1 for unigrams, 2 for bigrams, etc.
        all_ngrams
            Whether to use all order ngrams <= ngrams argument
        random_state
            random seed for fitting

        Example
        -------
        ```
        from imodelsx import LinearNgramClassifier
        import datasets
        import numpy as np

        # load data
        dset = datasets.load_dataset('rotten_tomatoes')['train']
        dset = dset.select(np.random.choice(len(dset), size=300, replace=False))
        dset_val = datasets.load_dataset('rotten_tomatoes')['validation']
        dset_val = dset_val.select(np.random.choice(len(dset_val), size=300, replace=False))

        # fit a simple ngram model
        m = LinearNgramClassifier()
        m.fit(dset['text'], dset['label'])
        preds = m.predict(dset_val['text'])
        acc = (preds == dset_val['label']).mean()
        print('validation acc', acc)
        ```
        """
        assert checkpoint in ["countvectorizer", "tfidfvectorizer"]
        self.checkpoint = checkpoint
        self.tokenizer = tokenizer
        self.ngrams = ngrams
        self.all_ngrams = all_ngrams
        self.random_state = random_state

    def fit(
        self,
        X_text: ArrayLike,
        y: ArrayLike,
        verbose=True,
    ):
        """Extract embeddings then fit linear model

        Parameters
        ----------
        X_text: ArrayLike[str]
        y: ArrayLike[str]
        """

        # metadata
        if isinstance(self, ClassifierMixin):
            self.classes_ = unique_labels(y)
        if self.random_state is not None:
            np.random.seed(self.random_state)

        # set up model
        if verbose:
            print("initializing model...")

        # get embs
        if verbose:
            print("calculating embeddings...")
        if self.all_ngrams:
            lower_ngram = 1
        else:
            lower_ngram = self.ngrams

        # get vectorizer
        if self.checkpoint == "countvectorizer":
            self.vectorizer = CountVectorizer(
                tokenizer=self.tokenizer, ngram_range=(lower_ngram, self.ngrams)
            )
        elif self.checkpoint == "tfidfvectorizer":
            self.vectorizer = TfidfVectorizer(
                tokenizer=self.tokenizer, ngram_range=(lower_ngram, self.ngrams)
            )

        # get embs
        embs = self.vectorizer.fit_transform(X_text)

        # train linear
        warnings.filterwarnings("ignore", category=ConvergenceWarning)
        if verbose:
            print("training linear model...")
        if isinstance(self, ClassifierMixin):
            self.linear = LogisticRegressionCV()
        elif isinstance(self, RegressorMixin):
            self.linear = RidgeCV()
        self.linear.fit(embs, y)

        return self

    def predict(self, X_text):
        """For regression returns continuous output.
        For classification, returns discrete output.
        """
        check_is_fitted(self)
        embs = self.vectorizer.transform(X_text)
        return self.linear.predict(embs)

    def predict_proba(self, X_text):
        check_is_fitted(self)
        embs = self.vectorizer.transform(X_text)
        return self.linear.predict_proba(embs)
Base class for all estimators in scikit-learn.
Inheriting from this class provides default implementations of:
- setting and getting parameters used by GridSearchCV and friends;
- textual and HTML representation displayed in terminals and IDEs;
- estimator serialization;
- parameters validation;
- data validation;
- feature names validation.
Read more in the scikit-learn User Guide (rolling_your_own_estimator).
Notes
All estimators should specify all the parameters that can be set at the class level in their __init__ as explicit keyword arguments (no *args or **kwargs).
Examples
>>> import numpy as np
>>> from sklearn.base import BaseEstimator
>>> class MyEstimator(BaseEstimator):
...     def __init__(self, *, param=1):
...         self.param = param
...     def fit(self, X, y=None):
...         self.is_fitted_ = True
...         return self
...     def predict(self, X):
...         return np.full(shape=X.shape[0], fill_value=self.param)
>>> estimator = MyEstimator(param=2)
>>> estimator.get_params()
{'param': 2}
>>> X = np.array([[1, 2], [2, 3], [3, 4]])
>>> y = np.array([1, 0, 1])
>>> estimator.fit(X, y).predict(X)
array([2, 2, 2])
>>> estimator.set_params(param=3).fit(X, y).predict(X)
array([3, 3, 3])
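LinearNgram class. Use either LinearNgramClassifier or LinearNgramRegressor rather than initializing this class directly.
Parameters
checkpoint : str
- Name of the vectorizer to use: "countvectorizer" or "tfidfvectorizer".
tokenizer
- Optional tokenizer passed through to the vectorizer's tokenizer argument; None uses the vectorizer's default.
ngrams
- Order of n-grams to extract: 1 for unigrams, 2 for bigrams, etc.
all_ngrams
- Whether to use n-grams of every order from 1 up to ngrams. If False, only n-grams of exactly order ngrams are used.
random_state
- Random seed for fitting. When not None, it is passed to np.random.seed.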
Example
    from imodelsx import LinearNgramClassifier
    import datasets
    import numpy as np

    # load data
    dset = datasets.load_dataset('rotten_tomatoes')['train']
    dset = dset.select(np.random.choice(len(dset), size=300, replace=False))
    dset_val = datasets.load_dataset('rotten_tomatoes')['validation']
    dset_val = dset_val.select(np.random.choice(len(dset_val), size=300, replace=False))

    # fit a simple ngram model
    m = LinearNgramClassifier()
    m.fit(dset['text'], dset['label'])
    preds = m.predict(dset_val['text'])
    acc = (preds == dset_val['label']).mean()
    print('validation acc', acc)
Ancestors
- sklearn.base.BaseEstimator
- sklearn.utils._repr_html.base.ReprHTMLMixin
- sklearn.utils._repr_html.base._HTMLDocumentationLinkMixin
- sklearn.utils._metadata_requests._MetadataRequester
Subclasses
- LinearNgramClassifier
- LinearNgramRegressor
Methods
def fit(self,
X_text: ArrayLike,
y: ArrayLike,
verbose=True)
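Vectorize the texts into n-gram features, then fit the linear model (LogisticRegressionCV for the classifier, RidgeCV for the regressor).
Parameters
X_text : ArrayLike[str]
- Input texts.
y : ArrayLike
- Targets: class labels for the classifier, continuous values for the regressor.
verbose
- Whether to print progress messages.
Returns
self
- The fitted estimator.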
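As a rough picture of what fit does for the classifier, the two steps (vectorize, then fit a cross-validated logistic regression) can be reproduced directly with scikit-learn. This is a minimal sketch under stated assumptions, not the library's code path: the texts and labels are invented, and `cv=3` is chosen here only because the toy dataset is tiny (the class source calls `LogisticRegressionCV()` with its defaults).

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegressionCV

# Toy data, invented for this sketch: 6 positive and 6 negative snippets.
texts = [
    "a great and moving film",
    "wonderful acting and a great story",
    "i loved this charming movie",
    "a brilliant, funny script",
    "great fun from start to finish",
    "a moving and wonderful story",
    "a dull and boring film",
    "terrible acting and a weak story",
    "i hated this tedious movie",
    "a boring, lifeless script",
    "dull from start to finish",
    "a weak and terrible story",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Step 1: turn texts into TF-IDF features over unigrams and bigrams.
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
features = vectorizer.fit_transform(texts)

# Step 2: fit the linear model on the sparse feature matrix.
# cv=3 is an assumption made for this small dataset.
linear = LogisticRegressionCV(cv=3)
linear.fit(features, labels)

# Prediction reuses the fitted vectorizer, as predict() does.
new_features = vectorizer.transform(["a great film", "a dull film"])
preds = linear.predict(new_features)
print(preds)
```

Keeping the fitted vectorizer on the estimator is what lets predict transform unseen text into the same feature space used during training.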
def predict(self, X_text)
For regression returns continuous output. For classification, returns discrete output.
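For the regression case, the same vectorize-then-predict flow yields a float array. The sketch below uses scikit-learn's TfidfVectorizer and RidgeCV directly as a stand-in for LinearNgramRegressor; the texts and ratings are made up for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import RidgeCV

# Toy data, invented for this sketch: short reviews with numeric ratings.
texts = [
    "a great and moving film",
    "wonderful acting and a great story",
    "i loved this charming movie",
    "great fun from start to finish",
    "a dull and boring film",
    "terrible acting and a weak story",
    "i hated this tedious movie",
    "dull from start to finish",
]
ratings = [4.5, 5.0, 4.0, 4.5, 1.5, 1.0, 1.0, 2.0]

vectorizer = TfidfVectorizer(ngram_range=(1, 2))
regressor = RidgeCV().fit(vectorizer.fit_transform(texts), ratings)

# predict() transforms new text with the fitted vectorizer, then calls the
# linear model; for a regressor the result is a continuous value per input.
scores = regressor.predict(vectorizer.transform(["a great story"]))
print(scores.dtype, scores.shape)
```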
def predict_proba(self, X_text)
Return class probability estimates from the fitted linear model. Only the classifier's LogisticRegressionCV provides predict_proba; the regressor's RidgeCV does not.
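The shape and meaning of the predict_proba output can be checked with the underlying scikit-learn models. This is a sketch on invented data, with `cv=2` chosen here only to suit the tiny dataset; it also shows that RidgeCV, the model used on the regressor path, has no predict_proba method.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegressionCV, RidgeCV

# Toy data, invented for this sketch.
texts = [
    "great film", "wonderful story", "loved this movie", "brilliant script",
    "dull film", "terrible story", "hated this movie", "boring script",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

vectorizer = CountVectorizer(ngram_range=(1, 2))
clf = LogisticRegressionCV(cv=2).fit(vectorizer.fit_transform(texts), labels)

# One row per input text, one column per class (ordered as clf.classes_).
probs = clf.predict_proba(vectorizer.transform(["great story", "boring film"]))
print(clf.classes_, probs.shape)

# Each row is a probability distribution over the classes.
print(np.allclose(probs.sum(axis=1), 1.0))

# The regressor's model does not implement predict_proba.
print(hasattr(RidgeCV(), "predict_proba"))
```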
def set_fit_request(self: LinearNgram,
*,
X_text: bool | str | None = '$UNCHANGED$',
verbose: bool | str | None = '$UNCHANGED$') -> LinearNgram
def func(*args, **kw):
    """Updates the `_metadata_request` attribute of the consumer (`instance`)
    for the parameters provided as `**kw`.

    This docstring is overwritten below.
    See REQUESTER_DOC for expected functionality.
    """
    if not _routing_enabled():
        raise RuntimeError(
            "This method is only available when metadata routing is enabled."
            " You can enable it using"
            " sklearn.set_config(enable_metadata_routing=True)."
        )

    if self.validate_keys and (set(kw) - set(self.keys)):
        raise TypeError(
            f"Unexpected args: {set(kw) - set(self.keys)} in {self.name}. "
            f"Accepted arguments are: {set(self.keys)}"
        )

    # This makes it possible to use the decorated method as an unbound method,
    # for instance when monkeypatching.
    # https://github.com/scikit-learn/scikit-learn/issues/28632
    if instance is None:
        _instance = args[0]
        args = args[1:]
    else:
        _instance = instance

    # Replicating python's behavior when positional args are given other than
    # `self`, and `self` is only allowed if this method is unbound.
    if args:
        raise TypeError(
            f"set_{self.name}_request() takes 0 positional argument but"
            f" {len(args)} were given"
        )

    requests = _instance._get_metadata_request()
    method_metadata_request = getattr(requests, self.name)

    for prop, alias in kw.items():
        if alias is not UNCHANGED:
            method_metadata_request.add_request(param=prop, alias=alias)
    _instance._metadata_request = requests

    return _instance
Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the scikit-learn User Guide on metadata routing for how the routing mechanism works.

The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters
X_text : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
- Metadata routing for X_text parameter in fit.
verbose : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
- Metadata routing for verbose parameter in fit.
Returns
self : object
- The updated object.
def set_predict_proba_request(self: LinearNgram,
*,
X_text: bool | str | None = '$UNCHANGED$') -> LinearNgram
Configure whether metadata should be requested to be passed to the predict_proba method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the scikit-learn User Guide on metadata routing for how the routing mechanism works.

The options for each parameter are:
- True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to predict_proba.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters
X_text : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
- Metadata routing for X_text parameter in predict_proba.
Returns
self : object
- The updated object.
def set_predict_request(self: LinearNgram,
*,
X_text: bool | str | None = '$UNCHANGED$') -> LinearNgram
Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the scikit-learn User Guide on metadata routing for how the routing mechanism works.

The options for each parameter are:
- True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to predict.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters
X_text : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
- Metadata routing for X_text parameter in predict.
Returns
self : object
- The updated object.
class LinearNgramClassifier (checkpoint: str = 'tfidfvectorizer',
tokenizer=None,
ngrams=2,
all_ngrams=True,
random_state=None)
class LinearNgramClassifier(LinearNgram, ClassifierMixin): ...
Classifier variant of LinearNgram: fit trains a LogisticRegressionCV on the n-gram features. The constructor parameters are the same as for LinearNgram.
Ancestors
- LinearNgram
- sklearn.base.BaseEstimator
- sklearn.utils._repr_html.base.ReprHTMLMixin
- sklearn.utils._repr_html.base._HTMLDocumentationLinkMixin
- sklearn.utils._metadata_requests._MetadataRequester
- sklearn.base.ClassifierMixin
Methods
def set_score_request(self: LinearNgramClassifier,
*,
sample_weight: bool | str | None = '$UNCHANGED$') -> LinearNgramClassifier
Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the scikit-learn User Guide on metadata routing for how the routing mechanism works.

The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters
sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
- Metadata routing for sample_weight parameter in score.
Returns
self : object
- The updated object.
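The sample_weight metadata that this request controls is the argument of the score method inherited from ClassifierMixin, which returns (optionally weighted) mean accuracy. The sketch below illustrates that on a plain LogisticRegression with invented numeric data, as a stand-in for the text classifier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Toy numeric data, invented for this sketch.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
weights = np.array([1.0, 1.0, 2.0, 2.0])

clf = LogisticRegression().fit(X, y)

# ClassifierMixin.score is accuracy on the given data, weighted per sample
# when sample_weight is supplied.
weighted = clf.score(X, y, sample_weight=weights)
reference = accuracy_score(y, clf.predict(X), sample_weight=weights)
print(weighted, reference)
```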
class LinearNgramRegressor (checkpoint: str = 'tfidfvectorizer',
tokenizer=None,
ngrams=2,
all_ngrams=True,
random_state=None)
class LinearNgramRegressor(LinearNgram, RegressorMixin): ...
Regressor variant of LinearNgram: fit trains a RidgeCV on the n-gram features. The constructor parameters are the same as for LinearNgram.
Ancestors
- LinearNgram
- sklearn.base.BaseEstimator
- sklearn.utils._repr_html.base.ReprHTMLMixin
- sklearn.utils._repr_html.base._HTMLDocumentationLinkMixin
- sklearn.utils._metadata_requests._MetadataRequester
- sklearn.base.RegressorMixin
Methods
def set_score_request(self: LinearNgramRegressor,
*,
sample_weight: bool | str | None = '$UNCHANGED$') -> LinearNgramRegressor
Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the scikit-learn User Guide on metadata routing for how the routing mechanism works.

The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters
sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
- Metadata routing for sample_weight parameter in score.
Returns
self : object
- The updated object.
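For regressors, the score method inherited from RegressorMixin returns the coefficient of determination (R^2), and sample_weight reweights the samples in that computation. The sketch below shows this on a plain Ridge model with invented numeric data, as a stand-in for the text regressor.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

# Toy numeric data, invented for this sketch.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = np.array([0.1, 0.9, 2.2, 2.8, 4.1])
weights = np.array([1.0, 1.0, 1.0, 2.0, 2.0])

reg = Ridge(alpha=1.0).fit(X, y)

# RegressorMixin.score is R^2 on the given data, weighted per sample
# when sample_weight is supplied.
weighted = reg.score(X, y, sample_weight=weights)
reference = r2_score(y, reg.predict(X), sample_weight=weights)
print(weighted, reference)
```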