Data For kubeflow/tf-operator

Only showing the last 50 predictions

# Title Body Link Prediction Confidence Labeled?
1129 Support get training process via Python SDK /feature `Kubeflow-tfjob` SDK should support getting training process/logs, so that user can... feature_request 0.99 True 0 0
1121 Popgroup is not be created automatically. Maybe the api of kube-batch is out-of-date? And my tf version is... bug 0.57 True 0 0
1115 TFConfig should be demonstrated more specifically. When I glanced at tf-operator example, I wondered how tf-operator map tensorflow-native code to... feature_request 0.8 True 0 0
1113 [chore] Remove tfjob dashboard Now we do not maintain the dashboard. Thus we should remove it in the repo and in the next... feature_request 0.61 True 0 0
1112 read TF_CONFIG env from configMap TF_CONFIG ENV was set on every pod with same content for now, we should set the env in... feature_request 0.73 True 0 0
1101 Long job names result in jobs stuck forever If your `TFJob` defines a `metadata.name` that is too long the operator will be unable to... bug 0.95 True 1 0
1099 Question: can't the base image "registry.access.redhat.com/ubi8/ubi:latest" in Dockerfile be replaces with "debian:buster" ? Hello guys: Can't the base image "registry.access.redhat.com/ubi8/ubi:latest" in Dockerfile... question 0.51 False 0 0
1096 can i install tf-operator alone without kubeflow? i have a k8s cluster already. and i want to use tf-operator to run tfjob。 but when i... question 0.92 True 1 0
1095 c feature_request 0.6 True 0 0
1094 TFJob test is failing on master and v0.7 branch for kubeflow/kubeflow Here's a link to a recent periodic... bug 0.9 True 0 0
1093 TFJob tests should use pytest The... feature_request 0.69 True 0 0
1091 Multiple Evaluator replicas gives InvalidTFJobSpec If I set `replicas: 2` in the `Evaluator` spec, I get `InvalidTFJobSpec 4s tf-operator ... bug 0.94 True 0 0
1090 Java client for current version of TFjob There is a repo, [tfjob-java-client](https://github.com/kubeflow-incubator/tfjob-java-client),... feature_request 0.52 False 0 0
1087 [enhancement] Replace common with kubeflow/common It is better to use kubeflow/common and remove the common package in tf-operator. /cc... feature_request 0.97 True 0 0
1086 Lack of documents for deployment I want to deploy a tf-operator on my cluster, but I can't find documents about it. does anybody... question 0.64 True 0 0
1079 Performance problem about pod informer ``` // Create pod informer. podInformer := kubeInformerFactory.Core().V1().Pods() // Set... feature_request 0.44 False 0 0
1078 [bug] Cannot initialize the training job when the user uses 1 worker and 0 PS There are some users want to use TFJob to run local training jobs with Estimator. They will have... bug 0.9 True 1 0
1077 Separate cluster scoped and namespace scoped resources Part of #1076 feature_request 0.92 True 0 0
1076 TFJob 1.0 Description | Category | Status | Issue -- | -- | -- | -- Kustomize package | Required |... feature_request 0.48 False 0 0
1068 [bug] Keep tf-job-role as deprecated label in this version Now we keep tf-job-name as the deprecated label. Thus I think we should keep the tf-job-role as well. bug 0.72 True 0 0
1066 GenLabels may select wrong Pods https://github.com/kubeflow/tf-operator/pull/1064 https://github.com/kubeflow/pytorch-operator/p... bug 0.91 True 0 0
1065 Can I create a tf-operator pod without using GO? I am unable to run my "TFJob pod" because I do not have a "tf-operator Pod". I have created a... bug 0.45 False 0 0
1060 tf-job-dashboard cannot work After install Kubeflow 0.6, tf-job-dashboard cannot work <img width="869" alt="image"... bug 0.96 True 1 0
1059 [discussion] Should We Add CleanPodPolicy PS? Now we have ``` CleanPodPolicyUndefined CleanPodPolicy = "" CleanPodPolicyAll ... question 0.83 True 0 0
1058 Refactor dockerfile In tf-operator... feature_request 0.71 True 0 0
1057 remove v1beta1 in v0.5.3 cause incompatible issue when using go mod We use tf-operator@v0.5.0 as our dependency, and go mod was try to get latest version of v0.5,... bug 0.75 True 0 1
1056 Invalid value: "v1beta1": must appear in spec.versions **Environment:** k8s 1.14.2 kuberctl 1.14.2 ks 0.13.1 kubeflow 0.4.1 minikube... bug 0.95 True 0 0
1053 Example on EKS: Device or resource busy Hi there, I'm trying... question 0.43 False 0 0
1048 can we add PriorityClassName when we create TF-job Podgroup? i use kube-batch to schedule for tf-job, kube-batch support set the priorityclass of podgroup,... feature_request 0.6 True 1 0
1045 TFjob still running while chief pod is completed Hello, I am using ` kubeflow.org/v1beta2` version and start a TFjob container only one chief... bug 0.74 True 1 0
1039 Is there any document for how to run TFJob in AllReduce Strategy Hello guys, I want to know if there is any document about how to run tfjob in all-reduce strategy? question 0.89 True 2 0
1035 tf-operator version conficts When running command **/opt/kubeflow/tf-operator.v1 -version** inside docker image... feature_request 0.4 False 0 0
1033 add E2E test for gang-scheduling We now support the gang-scheduling with using kube-batchd and PodGroup but we don't have tests for it. feature_request 0.98 True 0 0
1031 gang schedule annotation The annotation is need to be set when use gang scheduler as... feature_request 0.81 True 0 0
1030 [feature] Can we use one headless service for one job? We have ps/worker/chief for one TFJob. And now we create one headless service for one replica. I... feature_request 0.7 True 1 0
1029 Will tf-operator upgrading k8s to 1.13? I'm facing the problem that function **testing.NewPatchSubresourceAction** in... question 0.71 True 1 0
1026 no error log for create tfjob fail use api to create tfjob , i get this error: "create tfjob err,the error is the server rejected... bug 0.57 True 0 0
1024 Creating tfjob in dashboard usability issues - Invalid tfjob configurations do not display any errors. The user has to examine network... bug 0.91 True 0 0
1019 Deleting tf-job through the dashboard is not working When i tried to delete a tf-job through tf-job dashboard. I get a message saying "Are you sure... bug 0.95 True 1 0
1016 Create common CRD validate and mutating webhook for all operator If the spec of tfjob is invalid, we should reject the request when creating and also set default... feature_request 0.86 True 0 0
1011 Podgroup is constantly created and deleted after tfjob is success or failure Podgroup is constantly created and deleted after tfjob is success or failure, As shown... bug 0.96 True 0 0
1003 Failed to update TFJob status in version v1 I'm trying v0.5.1, after all pods & services created, there is an error msg: `error syncing... bug 0.85 True 1 0
1000 tfjob startTime should set immediately after create instead of wait pod of one replicaType are all running When I create tfjob with activeDeadlineSeconds but the image address is wrong, the pod of tfjob... bug 0.7 True 0 0
999 Jobs failing when a node is preempted On google kubernetes engine, I am finding that TFJobs fail when a node running a worker is... bug 0.82 True 0 0
997 tf-operator delete pod and service repeatedly tf-operator delete pod and service repeatedly when the tfjob is success or fail or exceed... bug 0.7 True 0 0
996 error with kubeflow instalation im installing toolkit for begin in kubernetes and with command... bug 0.89 True 0 0
994 tf-operator panic when cleanupTFJob tf-operator panic when clean tfjob that exceeds limit, because the `CompletionTime` of tfjob is... bug 0.93 True 0 0
991 Update kustomize files for tf-operator v1 feature_request 0.84 True 0 0
990 Create TFJob v1 documentation feature_request 0.92 True 0 0
989 Create TFJob v1 API and controller from v1beta2 TFjob v1beta2 has been stable for a while. We can create the v1 version now. feature_request 0.96 True 0 0